Science of the Total Environment 502 (2015) 315–329

Contents lists available at ScienceDirect

Science of the Total Environment

journal homepage: www.elsevier.com/locate/scitotenv

Modeller subjectivity and calibration impacts on hydrological model applications: An event-based comparison for a road-adjacent catchment in south-east Norway

Zahra Kalantari a,b,⁎, Steve W. Lyon b, Per-Erik Jansson a, Jannes Stolte c, Helen K. French d, Lennart Folkeson a, Mona Sassner e

a Department of Land and Water Resources, Royal Institute of Technology/KTH, SE-10044 Stockholm, Sweden
b Department of Physical Geography and Quaternary Geology, Stockholm University, SE-106 91 Stockholm, Sweden
c Norwegian Institute for Agricultural and Environmental Research, Bioforsk, Soil and Environment Division, NO-1432 Ås, Norway
d Department of Plant and Environmental Sciences, Norwegian University of Life Sciences, NO-1432 Ås, Norway
e DHI Sverige AB, SE-111 29 Stockholm, Sweden

HIGHLIGHTS

• We compared 4 hydrological models regarding their capabilities to predict peak flow.
• The efficiency of models can vary based on the hydroclimatic conditions.
• Modeller subjectivity plays an important role in model performance.
• Models used in designing roads must represent seasonal hydrological behaviour.
• Model calibration is a complicated process that is sensitive to modeller subjectivity.

ARTICLE INFO

Article history:
Received 26 June 2014
Received in revised form 10 September 2014
Accepted 10 September 2014
Available online xxxx

Editor: D. Barcelo

Keywords:
Extreme weather events
Road infrastructure
Road drainage
Hydrological model
Runoff

ABSTRACT

Identifying a ‘best’ performing hydrologic model in a practical sense is difficult due to the potential influences of modeller subjectivity on, for example, calibration procedure and parameter selection. This is especially true for model applications at the event scale where the prevailing catchment conditions can have a strong impact on apparent model performance and suitability. In this study, two lumped models (CoupModel and HBV) and two physically-based distributed models (LISEM and MIKE SHE) were applied to a small catchment upstream of a road in south-eastern Norway. All models were calibrated to a single event representing typical winter conditions in the region and then applied to various other winter events to investigate the potential impact of calibration period and methodology on model performance. Peak flow and event-based hydrographs were simulated differently by all models leading to differences in apparent model performance under this application. In this case study, the lumped models appeared to be better suited for hydrological events that differed from the calibration event (i.e., events when runoff was generated from rain on non-frozen soils rather than from rain and snowmelt on frozen soil) while the more physical-based approaches appeared better suited during snowmelt and frozen soil conditions more consistent with the event-specific calibration. This was due to the combination of variations in subsurface conditions over the eight events considered, the subsequent ability of the models to represent the impact of the conditions (particularly when subsurface conditions varied greatly from the calibration event), and the different approaches adopted to calibrate the models. These results indicate that hydrologic models may not only need to be selected on a case-by-case basis but also have their performance evaluated on an application-by-application basis since how a model is applied can be equally important as inherent model structure.

© 2014 Elsevier B.V. All rights reserved.

⁎ Corresponding author. Tel.: +46 8790 7377; fax: +46 8790 6857.
E-mail addresses: [email protected], zahra.kalantari@natgeo.su.se (Z. Kalantari), [email protected] (S.W. Lyon), [email protected] (P.-E. Jansson), [email protected] (J. Stolte), [email protected] (H.K. French), [email protected] (L. Folkeson), [email protected] (M. Sassner).

http://dx.doi.org/10.1016/j.scitotenv.2014.09.030
0048-9697/© 2014 Elsevier B.V. All rights reserved.

1. Introduction

Fig. 1. Land use and main soil types of the Skuterud catchment with a photo of the outlet.

Hydrological models are useful tools for investigating how rainfall transforms into runoff. This is particularly useful when hydrological models are considered for practical applications, such as in designing hydraulic structures associated with roads. Very often, however, the methods used for designing roads do not utilise state-of-the-science hydrological models. To date in Sweden, for example, road drainage structures (e.g. culverts and bridges) in rural areas have typically been dimensioned for flows with a return period of 50 years, adjusted to a changing climate by a simple static correction factor (Vägverket, 2008). However, these 50-year flows are calculated using the rational method, which is one of the oldest and simplest methods in hydrological engineering applications (Benzvi, 1989; Maidment, 1993). This method, based on statistical methods for estimating rain intensity curves and constant runoff coefficients, is in fact still quite popular world-wide owing to its simplicity. The rational method does not, however, have predictive capabilities to represent changes in climate conditions and land use coverages, making it of little value in considering future impacts on road systems. This raises questions with regard to this simple method's utility, as one of the main anticipated effects of climatic change is increased frequency of extreme weather events in various parts of the world (Green Paper EU, 2007; Schneider et al., 2007).
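To make the limitation concrete, the rational method described above can be sketched in a few lines. This is a minimal illustration and not the Swedish design procedure itself: the function name, the runoff coefficient of 0.3 and the climate factor of 1.2 are assumptions chosen for the example, while the catchment area (4.5 km²) and rain intensity (15 mm/h) are values that appear elsewhere in this paper.

```python
def rational_peak_flow(runoff_coeff, intensity_mm_per_h, area_km2, climate_factor=1.0):
    """Peak flow in m3/s from the rational method, Q = C * i * A.

    climate_factor is a static multiplier (hypothetical value in the
    example below) standing in for the 'simple static correction factor'.
    """
    if not 0.0 <= runoff_coeff <= 1.0:
        raise ValueError("runoff coefficient must lie between 0 and 1")
    intensity_m_per_s = intensity_mm_per_h / 1000.0 / 3600.0  # mm/h -> m/s
    area_m2 = area_km2 * 1.0e6                                # km2 -> m2
    return climate_factor * runoff_coeff * intensity_m_per_s * area_m2


# Illustrative inputs only: C = 0.3 and the factor 1.2 are assumed.
q_now = rational_peak_flow(0.3, 15.0, 4.5)
q_future = rational_peak_flow(0.3, 15.0, 4.5, climate_factor=1.2)
print(f"present: {q_now:.2f} m3/s, with static factor: {q_future:.2f} m3/s")
```

Because the runoff coefficient is a constant, nothing in this calculation responds to frozen soil, snow cover or antecedent wetness, which is the gap the dynamic models compared in this study are meant to fill.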

The current generation of hydrological models can potentially provide a better understanding of how weather events influence catchment-scale hydrology and peak flows (Jin et al., 2010), helping to improve road maintenance strategies and future road development (particularly in response to climate change). Independent of the modelling approach, the relative importance (or sensitivity) of a model's various parameters depends on the dominant hydrological conditions and processes in the region being modelled. For example, in cold regions the model parameters pertaining to soil freezing and thawing are important since infiltration rate can change due to changes in soil hydraulic conductivity, pore-size distribution in soil, and soil structure in frozen and partially frozen soil (Hillel, 1998). As such, the sequencing of frozen and non-frozen soil conditions, which determines the rate of water infiltration into soils (Hayashi et al., 2003), strongly influences the calibration and applicability of hydrological models in these regions.

Indeed, road design (and many other practical applications) could clearly benefit from using the current generation of hydrological models that have the possibility to include dynamic influences of (and potential future changes to) land use and climate when estimating peak flow. Care needs to be taken by the modeller, however, in exercising the subjectivity associated with not only selecting an appropriate model (i.e. one capable of representing the relevant processes), but also selecting the period/methodology considered for calibration as the latter potentially could have a large influence on the ‘best’ parameter set or the apparent model performance. For rainfall–runoff modelling of catchments, a wide variety of hydrological models are now available for implementation in road planning and construction. Numerous studies have compared the performance of hydrological models (Breuer et al., 2009; Clark et al., 2008; Deelstra et al., 2010a, 2010b; Gurtz et al., 2003; Hollander et al., 2009; Loague and Vander Kwaak, 2002; Plesca et al., 2012; Reed et al., 2004; Refsgaard and Knudsen, 1996). Due to differences in local hydrology determined by climatic, geological and surface conditions and through differences in modeller expertise determined by application, models may vary in their applicability in a given region, timescale, and setting. This is especially true in regions where there are strong seasonal variations in catchment-scale hydrological response.

Table 1
Characteristics of the LISEM, MIKE SHE, CoupModel and HBV models and their capabilities for various hydrological processes (SW = surface water, GW = groundwater, ET = evapotranspiration, PET = potential evapotranspiration).

| Process | LISEM | MIKE SHE | CoupModel | HBV |
|---|---|---|---|---|
| Surface water | | | | |
| Evapotranspiration | No | Yes | Yes | Yes |
| Land use distribution | Yes | Yes | No | No |
| Stream flow | No | Yes | No | No |
| Overland flow | Yes | Yes | Yes | No |
| Groundwater | | | | |
| Unsaturated flow | Yes | Yes | Yes | Yes |
| Groundwater flow | No | Yes | Yes | Yes |
| Tile drainage | No | Yes | Yes | No |
| SW/GW interaction | | | | |
| Frozen soil | Indirect, by altering the soil infiltration capacity | No | Yes | No |
| Snowmelt | Yes | Yes | Yes | Yes |
| Infiltration | Yes | Yes | Yes | Yes |
| Calibration | | | | |
| Event | Single period | Entire period, split period | Entire period, split period | Entire period, split period |
| Method | Subjective single parameter | Subjective single parameter | Subjective multi parameters | Objective multi parameters |
| Data | | | | |
| Forcing data requirements | Meteorological data such as air temperature, precipitation, PET | Meteorological data such as air temperature, precipitation, reference ET | Meteorological data such as air temperature, precipitation, explicit dynamic ET representation | Meteorological data such as air temperature, precipitation, PET |
| Independent input data | Landscape and vertical distributed input data (soil, vegetation, drainage) | Landscape and vertical distributed input data (soil, vegetation, drainage) | Vertical distributed input data (soil, vegetation, drainage) | Box-like design (soil data) |

Previous work, for example, by Deelstra et al. (2010a,b), compared different types of hydrological models (DrainMod, SWAT, HBV, COUP and INCA models) for an area in south-east Norway and found good agreement between measured and simulated values when integrating over longer periods of time (more than a week). It remains to be seen how such hydrological models perform at the shorter timescales (i.e. event scales) such as those required to estimate peak discharges for designing road systems. Furthermore, before these hydrological models can be used at event scale, the relative (and potentially confounding) role of calibration period and modeller subjectivity on model performance under case-by-case specific applications must be considered. As such, the overall aims of the present study were to evaluate the performance of a selection of hydrological models for simulating discharge dynamics from an area upstream of a road on an hourly basis and to examine the impact of calibration approach and period on model performance at event scales. By applying four different hydrological models to estimate rainfall–runoff responses under varying winter conditions (i.e., frozen to unfrozen soils, snow to no snow cover) and differing calibration procedures, we explore the potential role of modeller subjectivity on model performance based on the conditions of application and model structure.

2. Materials and methods

2.1. Study area

The Skuterud catchment near Ås, approximately 30 km south-east of Oslo, Norway (Fig. 1) was selected as the study area. It is a well-documented catchment in which various hydrological models have been tested and applied previously (e.g., Deelstra et al., 2010a, 2010b). Furthermore, the surrounding region is of general interest to the Norwegian Public Road Administration (Statens vegvesen) owing to its hydrological variability and associated impact on road design, construction and maintenance. The total area of the catchment is about 4.5 km² and the main land use is agriculture (mostly grain, potato and ley crops) covering about 60% of the total area. Field drains installed at about 10 m spacing and 80 cm depth are commonly used in agricultural areas of the catchment to remove excess soil water. The remaining land cover comprises forest (30% of total area) and urban areas (10% of total area). The main soil type is silty clay loam with some sand and moraine deposits (Deelstra et al., 2005), while loams and sandy loams occur in the transition between historic marine and shore deposits (Kværnø et al., 2007).

The mean annual temperature at Ås is 5.3 °C, with a minimum measured mean monthly temperature of −4.8 °C in January/February and a maximum of 16.1 °C in July. Mean annual precipitation for the period 1960–1990 (which is used as the reference period for most climate-based research in Norway) is 785 mm, with a minimum measured monthly mean of 35 mm in February and a maximum of 100 mm in October. Mean annual potential evapotranspiration (PET) in the region is about 535 mm/y (Deelstra et al., 2010a, 2010b; Thue-Hansen and Grimenes, 2009).

2.2. Hydrological models

Numerous models are available for modelling discharge and hydrological processes in small catchments such as Skuterud. Further, hydrological models can be applied with different calibration procedures and under different conditions requiring vastly different types of data for parameter identification. These differences make direct comparison of modelling performances difficult and, along with the sheer number of models available, make it impossible to apply all models to determine which is ‘best’ in any real sense. As such, our goal here is not to identify a best model; rather, we consider differences in model applications from a cross-section of four specific models differing in structure, calibration procedure, and input requirements. We seek to identify the relative role of calibration period and approach (modeller subjectivity) on model performance at the event scale. These four models (LISEM, MIKE SHE, CoupModel and HBV) have been widely applied in the region and the scientific literature (Jetten and De Roo, 2001; Refsgaard and Storm, 1995; Jansson and Karlberg, 2004; Bergström, 1976, 1992). They have also been identified as models for potential use in designing road hydraulic structures under current and future climate scenarios by the Norwegian Public Road Administration (Statens vegvesen, 2011). They span from more physically-based, distributed models (LISEM and MIKE SHE) to more lumped models (CoupModel and HBV) and have been set up using different calibration approaches. Each model is briefly described in Section 1 in the Supplementary material and their main features are summarised in Table 1. For a full description of each model, please see the relevant literature.

Table 2
LISEM, MIKE SHE, CoupModel and HBV model parameters. DTM = digital terrain model. n.a.: not available.

| Model parameter | Parameter | LISEM method | LISEM value | MIKE SHE method | MIKE SHE value |
|---|---|---|---|---|---|
| Overland flow | Surface roughness | Kinematic wave model | Spatially distributed (gridded data), Manning's n between 0.2 and 0.4 | 2D finite difference, diffusive wave | Spatial distribution, roughness coefficient (M) between 5 and 6 |
| | Slope parameters | | Spatial distribution (gridded data), topographical data from DTM map | | Spatial distribution, topographical data from DTM map |
| River flow | River bed roughness | Kinematic wave model | Manning's n value 0.5 | 1D St. Venant equation | Roughness coefficient, M = 30 |
| | River bed section (estimated from topography data) | | | | |
| Flow parameters (unsaturated and saturated zone) | Saturated hydraulic conductivity (Ks) | 1D finite difference solution of the Richards equation (no horizontal flow) | Ks: 1e−005, 3.4e−005, 3e−007, 7.6e−007 m/s | 1D finite difference, Richards equation in unsaturated zone (UZ); 3D finite difference, Darcy flow in saturated zone (SZ) | Ks = 4.2e−005 to 1e−007 m/s |
| | Water content at saturation (porosity) | | 0.4, 0.4, 0.38, 0.47 m³/m³ | | 0.38, 0.47, 0.58, 0.6, 0.7 m³/m³ |
| | Water content at field capacity | | 0.36, 0.37, 0.37, 0.38 m³/m³ | | 0.3, 0.3, 0.3, 0.48, 0.6 m³/m³ |
| | Water content at wilting point | | 0.22, 0.23, 0.25, 0.27 m³/m³ | | 0.2, 0.2, 0.16, 0.26, 0.2 m³/m³ |
| | Number of layers | | 10 | | UZ: 4; SZ: 3 |
| | Layer thickness | | Varies between 0.01 m and 0.3 m; 1 m in total | | UZ: 0.1 m, 0.1 m, 0.3 m, 0.5 m; 1 m in total. SZ: 2 m, 18 m, 80 m; 100 m in total |
| | Horizontal conductivity (Kh), vertical conductivity (Kv) | | – | | Kh = 1e−006 to 1e−008 m/s; Kv = 1e−006 to 1e−008 m/s |
| | Specific yield | | – | | First layer: 0.04 to 0.1; second and third layer: 0.0001 |
| | Specific storage | | – | | First layer: 0.001 to 0.006 (1/m); second and third layer: 1e−006 (1/m) |
| Actual evapotranspiration | Leaf area index, LAI | Leaf area index is used for interception calculations | Coniferous forest: 2.4; arable land: 3.1; swamp: 5.9 | MIKE SHE uses the Kristensen and Jensen method for calculating actual evapotranspiration based on reference evaporation, leaf area index, root depth for each vegetation type, and a set of empirical parameters | Coniferous forest: 4–6.5; arable land: 1–6; swamp: 4–6 |
| | Root depth, RD | | Coniferous forest: 0.45 m; arable land: 0.1–1 m; swamp: 1 m | | Coniferous forest: 0.45 m; arable land: 0.1–1 m; swamp: 1 m |
| | Crop coefficient, Kc | | | | Coniferous forest: 1–1.3; arable land: 1–1.3; swamp: 1 |
| Drainage option | Drain level | n.a. | | Empirical formula | −0.8 m relative to the ground |
| | Drainage time constants | | | | 5.5e−007 s⁻¹ |
| | Drain spacing | | | | Drainage routed downhill based on adjacent drain level |
| Snow pack | Threshold melting temperature | n.a. | | Degree-day method | 0 °C |

| Model parameter | Parameter | CoupModel method | CoupModel value | HBV method | HBV value |
|---|---|---|---|---|---|
| Overland flow | Surface roughness | One linear reservoir equation | SurfCoef = 1.92; SurfPoolMax = 0 | Three linear reservoir equations | n.a. (a simplified version of HBV coupled with CoupModel used in this study) |
| | Slope parameters | | | | |
| River flow | River bed roughness; river bed section | n.a. | – | A triangular weighting function | – |
| Flow parameters | Saturated hydraulic conductivity (Ks) | Richards equation combined with water uptake and drainage sink function above groundwater level. Only drainage equations below groundwater level | Ks = 2.5e−005, 4.5e−005 m/s | Functions of actual water storage in a soil box, single pathway with single regulation | Critical uptake frac = 4.5e−004 m; field capacity = 0.05 m; initial base storage = 0.03 m; initial peak storage = 0.03 m; initial soil storage = 0.03 m |
| | Water content at saturation (porosity) | | 0.35–0.46 | | |
| | Water content at field capacity | | – | | |
| | Water content at wilting point | | 0.425 | | |
| | Number of layers | | 10 | | |
| | Layer thickness | | Varies between 0.01 m and 0.6 m; 1 m in total | | |
| | Horizontal conductivity (Kh), vertical conductivity (Kv) | | | | |
| | Specific yield | | – | | |
| | Specific storage | | – | | |
| Actual evapotranspiration | Leaf area index | Different pathways with different regulations | Arable land: 1–6 | Functions of actual water storage in a soil box, single pathway with single regulation | Calculated from potential ET estimated by Penman formula in CoupModel |
| | Root depth | | 0.1–1 m | | |
| Drainage option | Drain level | Ernst model, physical-based accounting methods modified for a vertical soil profile | −0.8 m | – | – |
| | Drain time constants | | – | | |
| | Drain spacing | | 10 m | | |
| Snow pack | Threshold melting temperature | Energy balance equation, including surface heat exchange, radiation and near-surface soil heat flux | 0.8 °C | Degree-day method (in this study calculated by CoupModel) | 0.8 °C |
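Table 2 lists a degree-day method with a threshold melting temperature for the snow pack in MIKE SHE (0 °C) and HBV (0.8 °C). As a rough sketch of how such a threshold enters the calculation, the generic textbook form is shown below. The degree-day factor of 3 mm per °C per day, the temperature series and the 24 mm snow pack are hypothetical values chosen for illustration; they are not taken from the model setups in this study.

```python
def degree_day_melt(air_temp_c, snow_water_mm, threshold_c=0.0, ddf_mm_per_degc_day=3.0):
    """Daily snowmelt from a simple degree-day rule.

    melt = ddf * max(T - T_threshold, 0), limited by the snow that is left.
    Returns the list of daily melt (mm) and the remaining snow water (mm).
    ddf_mm_per_degc_day is an assumed, illustrative degree-day factor.
    """
    remaining = snow_water_mm
    melt_series = []
    for temp in air_temp_c:
        potential = ddf_mm_per_degc_day * max(temp - threshold_c, 0.0)
        melt = min(potential, remaining)  # cannot melt more snow than exists
        remaining -= melt
        melt_series.append(melt)
    return melt_series, remaining


# Hypothetical five-day warming spell over a 24 mm snow pack.
temps = [-2.0, 0.5, 1.5, 3.0, 4.0]
melt_t0, left_t0 = degree_day_melt(temps, 24.0, threshold_c=0.0)
melt_t08, left_t08 = degree_day_melt(temps, 24.0, threshold_c=0.8)
print("threshold 0.0 degC:", melt_t0, "remaining", left_t0)
print("threshold 0.8 degC:", melt_t08, "remaining", left_t08)
```

Under these assumed inputs the lower threshold starts melting a day earlier and exhausts the snow pack, whereas the higher threshold leaves snow on the ground, so an apparently small difference in this one parameter can shift the timing of event runoff.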

2.3. Input data

2.3.1. Model parameter data

The input data and pre-processing required differ between the four models (Table 1). Most relevant input data used here for the Skuterud catchment were taken from Deelstra et al. (2005) while the initial unsaturated conductivity values for the soils were taken from Kværnø and Deelstra (2003). These data serve as initial estimates for parameter values (Table 2) that were subsequently calibrated for a given runoff event (see next section). Note that, because of variations in previous applications of these models and differences in model parameter definitions, the initial parameter values can vary between the different modelling approaches. For further description of data and process representations used in these models, please see Section 2 in the Supplementary material.

2.3.2. Rainfall–runoff event data

The rainfall data used were recorded historical maximum short-term precipitation data taken from the Norwegian Meteorological Institute database. The discharge data were measured using a submerged weir of triangular profile (Crump, 1952) installed at the catchment outlet. The water level was recorded by a pressure transducer combined with a Campbell data logger and the discharge computed every 5 min based on a head-discharge relationship specific to the weir (Deelstra and Iital, 2008). Using data from the Skuterud catchment, it was possible to analyse winter runoff during three distinct periods containing a total of eight main rainfall–runoff events (Table 3). Period I (10–18 January 2008) represented hydrological conditions with rain or snowmelt (or both) on partially frozen soil. This period thus accounted for the presence of snow on the ground and its melting and subsequent runoff and infiltration of water into the soil. The period included two main events, a single event on 10 January and an extended, compounded event on 13–18 January 2008. Period II (20 November–10 December 2007) represented frozen soil and rain event conditions. This period included four main events, on 22, 25 and 29 November and 1 December 2007. Period III (2–12 November 2007) represented non-frozen soil and rain event conditions and consisted of two main events, on 5 and 8 November 2007.

Table 3
List of events used to compute the event-based statistics in the four models used.

| Period | Event no. | Use | Date | Precipitation | Soil | Snow cover |
|---|---|---|---|---|---|---|
| Period I | 1 | Calibration event | 10–11 Jan 2008 | Rain | Partially frozen | Yes |
| Period I | 2 | Validation event | 13–18 Jan 2008 | Rain | Partially frozen | Yes |
| Period II | 3 | Validation event | 22–23 Nov 2007 | Rain | Frozen | Partial |
| Period II | 4 | Validation event | 24–25 Nov 2007 | Rain | Frozen | Partial |
| Period II | 5 | Validation event | 29–30 Nov 2007 | Rain | Frozen | Partial |
| Period II | 6 | Validation event | 01–02 Dec 2007 | Rain | Frozen | Partial |
| Period III | 7 | Validation event | 05–06 Nov 2007 | Rain | Non-frozen | No |
| Period III | 8 | Validation event | 08–09 Nov 2007 | Rain | Non-frozen | No |

2.3.3. Initial conditions

With regard to the initial conditions used for event-based calibration (see next section), these also differed slightly between the applications of the four models in this case study because of the inherent nature of the models. In the LISEM model (single-event model), the initial conditions (e.g., soil moisture) are defined by the user and in this current study were based on previous studies (e.g., Kværnø and Stolte, 2012). This requires a great deal of knowledge on the part of the user and can affect model output significantly. In applications of MIKE SHE, CoupModel and HBV (continuous models), hydrological simulations are commonly initialised by running the model repeatedly over a given period until an equilibrium state is achieved such that this ‘spin-up’ period lowers the influence of biased initial conditions (Doe and Harmon, 2001). Such a spin-up approach was adopted here for the application of MIKE SHE, CoupModel and HBV with each model run for a looping 1-year period (i.e., the year that contained the event being modelled) to determine the initial conditions.
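The spin-up idea can be illustrated with a deliberately simple stand-in for a continuous model. The sketch below loops one year of forcing through a toy linear reservoir until the end-of-year storage stops changing; the reservoir, its recession constant, the tolerance and the synthetic rainfall series are all assumptions made for illustration and do not represent MIKE SHE, CoupModel or HBV.

```python
def run_year(storage_mm, rain_mm, k=0.002):
    """Toy linear reservoir: add rain, release a fixed fraction k each step."""
    for p in rain_mm:
        storage_mm += p
        storage_mm -= k * storage_mm  # outflow proportional to storage
    return storage_mm


def spin_up(rain_mm, initial_storage_mm=0.0, tol_mm=0.01, max_loops=100):
    """Repeat the same year until end-of-year storage changes by < tol_mm.

    Returns the equilibrium storage and the number of loops needed.
    """
    storage = initial_storage_mm
    for loop in range(1, max_loops + 1):
        new_storage = run_year(storage, rain_mm)
        if abs(new_storage - storage) < tol_mm:
            return new_storage, loop
        storage = new_storage
    raise RuntimeError("spin-up did not reach equilibrium")


# Synthetic daily rainfall for one year (365 values), purely illustrative.
rain = [0.0, 0.0, 5.0, 1.0, 0.0] * 73

dry_start, loops_dry = spin_up(rain, initial_storage_mm=0.0)
wet_start, loops_wet = spin_up(rain, initial_storage_mm=2000.0)
print(f"dry start: {dry_start:.1f} mm after {loops_dry} loops")
print(f"wet start: {wet_start:.1f} mm after {loops_wet} loops")
```

In this toy setting both starting guesses settle on essentially the same storage, which is the purpose of the spin-up: the state used at the start of the event no longer depends on an arbitrary initial value. A single-event model such as LISEM has no equivalent step, so the user-defined initial soil moisture carries through to the result.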

2.4. Event-based calibration

In addition to the base model parameters adopted from available data and previous model applications at Skuterud, all four models were calibrated specifically to the event on 10 January 2008. This event was selected as it represents a common winter condition event for this region where, after a period of frost, there was a partially frozen top layer of the soil (Fig. 2). This gave rise to a quick and large response to a single rainfall on snow event that can be considered to typify winter rainfall–runoff responses at Skuterud catchment (Deelstra et al., 2009). Owing to the different setups and structures (and ideologies) of the wide range of models being applied, different event-specific calibration strategies were employed for each model. For further description of model calibration strategies, please see Section 3 in the Supplementary material.

3. Results

3.1. Period I

Period I included snowmelt, partially frozen soil and rain event con-ditions. The event on 10 January 2008was used to calibrate fourmodels

Hydrograph

Measured peakflow (m3/s)

Maximum precipitationintensity (mm/h)

LISEM MIKE SHE CoupModel HBV

Q = 2 P = 8 Yes Yes Yes Yes

Q = 3 P = 15 Yes Yes YesQ = 0.75 P = 4.5 No Yes Yes YesQ = 0.8 P = 3 No Yes Yes YesQ = 0.9 P = 7 No Yes Yes YesQ = 2.5 P = 10 No Yes Yes YesQ = 0.6 P = 4.5 No Yes Yes YesQ = 0.7 P = 5 No Yes Yes Yes

Event 1 (calibration) Event 2 (validation)

Fig. 2. Period I: Simulated discharge forMIKE SHE (blue line), median value CoupModel (grey line), CoupModel uncertainty band (grey band) shows the range between theminimumandmaximumdischarge based on 614 accepted runs for CoupModel.Median value HBV (orange line), HBV uncertainty band (yellow band) shows the range between theminimumandmax-imumdischarge basedon244 accepted runs for HBV. LISEM(green line) andmeasured discharge (blackdashed line). Theevent 1 on10 Januarywasused for calibration andevent 2 during13–18 January for validation. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

321Z. Kalantari et al. / Science of the Total Environment 502 (2015) 315–329

while the compounded event on 13–18 January provides validation ofthe models' performances for an event occurring under similar condi-tions (Table 3). The event on 10 January had a total rainfall of 13 mmwith a maximum intensity of about 8 mm/h. The rainfall duration forthis event was 5 h. The compounded event on 13–18 January had

80.6 mm total rainfall. This event, with a maximum intensity of about15 mm/h and duration of 11 h, was equal to a 2-year storm accordingto the Norwegian Meteorological Institute. So, clearly, even though theconditions in the catchment were similar, the rainfall events wererather different leading to different runoff responses. The snow water

Table 4
Statistics for the four models (peak error (m3/s) = Q peak (simulated) − Q peak (observed), R2 and NSE = Nash–Sutcliffe simulation efficiency before calibration of discharge during each event for each model). There are no statistics available for LISEM before the calibration because it could not produce discharge without prior calibration. *The event on 13–18 Jan 2008 includes five continuous discharge peaks in the hydrograph. The value here is the average of the residual from all the peaks. The R2 and NSE values were obtained for a simulation in MIKE SHE. These values were compared with the median, minimum and maximum values of R2 and NSE for all runs in CoupModel and HBV. The average statistics for the three periods are made for the median values of CoupModel and HBV.

| Period | Event number | Role | Date | Statistic | LISEM | MIKE SHE | CoupModel | HBV |
|---|---|---|---|---|---|---|---|---|
| I | 1 | Calibration | 10–11 Jan 2008 | Peak error | – | −0.6 | −0.91 | −0.87 |
| | | | | R2 | – | 0.1 | 0.6 (0–0.8) | 0.43 (0–0.82) |
| | | | | NSE | – | −0.52 | 0.27 (−1.38–0.69) | −0.04 (−1.48–0.71) |
| I | 2 | Validation | 13–18 Jan 2008* | Peak error | – | −0.14 | −0.35 | 0.27 |
| | | | | R2 | – | 0.62 | 0.76 (0.38–0.85) | 0.73 (0.37–0.86) |
| | | | | NSE | – | 0.13 | 0.51 (−1.78–0.87) | 0.36 (−1.89–0.86) |
| II | 3 | Validation | 22–23 Nov 2007 | Peak error | – | 0.43 | 0.72 | 0.54 |
| | | | | R2 | – | 0.02 | 0.47 (0–0.82) | 0.64 (0.17–0.87) |
| | | | | NSE | – | −2.25 | −6.41 (−16.98–0.55) | −0.36 (−12.8–0.86) |
| II | 4 | Validation | 24–25 Nov 2007 | Peak error | – | 0.97 | 0.36 | 0.16 |
| | | | | R2 | – | 0.02 | 0.55 (0.21–0.76) | 0.63 (0.13–0.8) |
| | | | | NSE | – | −9.01 | −7.58 (−25.73–0.6) | −1.48 (−25.1–0.73) |
| II | 5 | Validation | 29–30 Nov 2007 | Peak error | – | 0.07 | −0.19 | −0.41 |
| | | | | R2 | – | 0.27 | 0.4 (0–0.97) | 0.5 (50–0.98) |
| | | | | NSE | – | −0.47 | −0.05 (−8.16–0.44) | −0.42 (−6.93–0.80) |
| II | 6 | Validation | 01–02 Dec 2007 | Peak error | – | 0.83 | 0.22 | 1.39 |
| | | | | R2 | – | 0.59 | 0.6 (0–0.97) | 0.45 (0–0.96) |
| | | | | NSE | – | 0.42 | 0.01 (−1.65–0.86) | 0.02 (−2.22–0.95) |
| III | 7 | Validation | 05–06 Nov 2007 | Peak error | – | 0.18 | 0.22 | −0.1 |
| | | | | R2 | – | 0.06 | 0.51 (0–0.91) | 0.65 (0.31–0.78) |
| | | | | NSE | – | −0.51 | −0.83 (−17.18–0.89) | −0.12 (−10.5–0.729) |
| III | 8 | Validation | 08–09 Nov 2007 | Peak error | – | 0.33 | 0.26 | 0.08 |
| | | | | R2 | – | 0.23 | 0.7 (0.16–0.94) | 0.68 (0.17–0.91) |
| | | | | NSE | – | −0.16 | −0.43 (−11.96–0.87) | −0.02 (−8.99–0.89) |
| I | Average | | | Peak error | – | −0.4 | −0.6 | −0.3 |
| | | | | R2 | – | 0.4 | 0.7 | 0.5 |
| | | | | NSE | – | −0.2 | 0.4 | 0.7 |
| II | Average | | | Peak error | – | 0.6 | 0.3 | 0.4 |
| | | | | R2 | – | 0.2 | 0.5 | 0.6 |
| | | | | NSE | – | −2.8 | −3.5 | −0.6 |
| III | Average | | | Peak error | – | 0.3 | 0.2 | 0.0 |
| | | | | R2 | – | 0.1 | 0.6 | 0.7 |
| | | | | NSE | – | −0.3 | −0.6 | −0.1 |

322 Z. Kalantari et al. / Science of the Total Environment 502 (2015) 315–329

equivalent before the event was 2.4 cm based on a measurement in the study area on 8 January. The soil temperature on 10 January was −0.5 °C, −0.3 °C and 0.0 °C at 10, 20 and 40 cm depth, respectively, measured at the meteorological station of the Department of Agricultural Engineering (ITF) of the UMB (Norwegian University of Life Sciences), which is located within the catchment. The duration of soil frost and the depth to which the soil was actually frozen are associated with uncertainty, as they are typically related to the variability in the depth and duration of snow and vegetation cover and to soil properties. This resulted in partially frozen soil conditions across the catchment during Period I.

LISEM and MIKE SHE produced good estimates of the peak discharge for the 10 January event for which they were calibrated. However, they failed to reproduce the shape of the recession following the event. In contrast, HBV and CoupModel successfully described the general shape and the total volume during the calibration event, but did not succeed in simulating the peak discharge for which they were calibrated (Fig. 2). The results for the compound event on 13–18 January indicated that the four models differed in their prediction of the dynamics with respect to timing and intensity (Fig. 2). The measured discharge on 14 January was not predicted accurately by any of the models, although both CoupModel and HBV simulated an increased runoff compared with the measured data. The measured peak discharge on 15 January was not predicted by LISEM and the last measured peak discharge, on 17 January, was overestimated by both LISEM and MIKE SHE.

In order to see the impact of calibration on the models' performance, NSE, R2 and peak error were calculated before and after calibration (Tables 4 and 5, respectively). Inter-model comparison of the model performance under these specific applications in Period I showed that the lumped models appear to have better R2 and NSE values than the distributed models (Table 5). Calculations of the percentage change in peak error, R2 and NSE before and after calibration of discharge for each model indicated that the errors introduced by CoupModel were about the same as those introduced by HBV (Table 6). The percentage change in MIKE SHE discharge error showed a larger value, and hence a larger effect of calibration in improving the model (Table 6). LISEM could not produce any discharge before calibration of the model.
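The statistics reported in Tables 4 and 5 can be sketched in code as follows. This is a minimal illustration and not the authors' implementation: NSE follows the standard Nash–Sutcliffe definition and peak error follows the definition given in the table captions, whereas R2 is assumed here to be the squared Pearson correlation, and `relative_change` is only one common way of expressing a before/after change that may differ from the formula behind Table 6. All function names and the example series are hypothetical.

```python
def peak_error(simulated, observed):
    """Peak error = Q_peak(simulated) - Q_peak(observed), in the units of the series."""
    return max(simulated) - max(observed)


def mean_peak_error(simulated_peaks, observed_peaks):
    """Average residual over several peaks, as described for the multi-peak event."""
    residuals = [s - o for s, o in zip(simulated_peaks, observed_peaks)]
    return sum(residuals) / len(residuals)


def nash_sutcliffe(simulated, observed):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    mean_obs = sum(observed) / len(observed)
    numerator = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    denominator = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - numerator / denominator


def r_squared(simulated, observed):
    """Assumption: R2 taken as the squared Pearson correlation coefficient."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_sim = sum(simulated) / n
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(observed, simulated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_sim = sum((s - mean_sim) ** 2 for s in simulated)
    return cov ** 2 / (var_obs * var_sim)


def relative_change(before, after):
    """Assumption: change relative to the magnitude of the pre-calibration value."""
    return (after - before) / abs(before)


# Hypothetical discharge series (m3/s) for a single event.
observed = [0.2, 0.5, 1.4, 0.9, 0.4]
simulated = [0.3, 0.6, 1.0, 0.8, 0.5]
print(peak_error(simulated, observed),
      nash_sutcliffe(simulated, observed),
      r_squared(simulated, observed))
```

NSE penalises timing and volume errors over the whole hydrograph, while peak error isolates the single quantity most relevant for drainage design, which is why a model can rank differently on the two measures.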

3.2. Period II

Period II included frozen soil and rain event conditions. This period provides validation of the models' performances for four events (22, 25, 29 November and 1 December 2007) occurring under rather similar conditions to those in the calibration event (Table 3). These events had a maximum rainfall intensity of between 3 and 10 mm/h (Table 3). Because no measured soil temperature data were available for Periods II and III, the soil temperatures simulated by CoupModel were used to analyse possible runoff generation mechanisms based on a range of soil temperature and soil frost conditions related to various hydrological responses (Figs. 3 and 4). The soil at shallow depth (5 cm) was completely frozen according to the model when the air temperature also decreased to below −5 °C. Under these conditions, the

Table 5
Statistics for the four models (peak error (m3/s) = Q peak (simulated) − Q peak (observed), R2 and NSE = Nash–Sutcliffe simulation efficiency after calibration of discharge during each event for each model). There are no statistics available for LISEM for Periods II and III. *The event on 13–18 Jan 2008 includes five continuous discharge peaks in the hydrograph. The value here is the average of the residual from all the peaks. The R2 and NSE values were obtained for the best selected simulations in LISEM and MIKE SHE. These values were compared with the median, minimum and maximum values of R2 and NSE for all runs in CoupModel and HBV. The average statistics for the three periods are made for the median values of CoupModel and HBV.

| Period | Event number | Role | Date | Statistic | LISEM | MIKE SHE | CoupModel | HBV |
|---|---|---|---|---|---|---|---|---|
| I | 1 | Calibration | 10–11 Jan 2008 | Peak error | 0.2 | 0.49 | −0.64 | −0.56 |
| | | | | R2 | 0.69 | 0.6 | 0.79 (0.79–0.8) | 0.8 (0.79–0.82) |
| | | | | NSE | 0.37 | 0.28 | 0.61 (0.6–0.64) | 0.61 (0.6–0.67) |
| I | 2 | Validation | 13–18 Jan 2008* | Peak error | −0.03 | 0.04 | −0.28 | −0.22 |
| | | | | R2 | 0.87 | 0.79 | 0.79 (0.79–0.80) | 0.78 (0.74–0.86) |
| | | | | NSE | 0.31 | 0.16 | 0.62 (0.60–0.63) | 0.4 (0.07–0.82) |
| II | 3 | Validation | 22–23 Nov 2007 | Peak error | – | 0.52 | 1.15 | 0.54 |
| | | | | R2 | – | 0.65 | 0.39 (0.26–0.43) | 0.61 (0.43–0.82) |
| | | | | NSE | – | −0.52 | −3.73 (−13.3 to −0.88) | −2.31 (−7.03–0.72) |
| II | 4 | Validation | 24–25 Nov 2007 | Peak error | – | 0.73 | 0.36 | 0.16 |
| | | | | R2 | – | 0.7 | 0.55 (0.49–0.64) | 0.65 (0.51–0.72) |
| | | | | NSE | – | −3.5 | −3.60 (−14.2 to −1.07) | −2.36 (−13.0–0.64) |
| II | 5 | Validation | 29–30 Nov 2007 | Peak error | – | 0.08 | −0.15 | −0.41 |
| | | | | R2 | – | 0.72 | 0.17 (0.01–0.33) | 0.49 (0.06–0.97) |
| | | | | NSE | – | −0.49 | −0.01 (−0.37–0.32) | 0.17 (−1.07–0.70) |
| II | 6 | Validation | 01–02 Dec 2007 | Peak error | – | 0.52 | 0.8 | 1.39 |
| | | | | R2 | – | 0.94 | 0.79 (0.74–0.84) | 0.83 (0.71–0.89) |
| | | | | NSE | – | 0.76 | 0.11 (−0.32–0.5) | 0.06 (−0.62–0.88) |
| III | 7 | Validation | 05–06 Nov 2007 | Peak error | – | 0.62 | 0.22 | −0.06 |
| | | | | R2 | – | 0.49 | 0.45 (0.01–0.87) | 0.66 (0.56–0.77) |
| | | | | NSE | – | −0.16 | −0.69 (−13.17–0.80) | −0.11 (−3.66–0.67) |
| III | 8 | Validation | 08–09 Nov 2007 | Peak error | – | 0.69 | 0.26 | 0.08 |
| | | | | R2 | – | 0.64 | 0.7 (0.45–0.92) | 0.84 (0.76–0.91) |
| | | | | NSE | – | 0.13 | −0.34 (−7.73–0.64) | 0.29 (−4.23–0.88) |
| I | Average | | | Peak error | 0.1 | 0.3 | −0.5 | −0.4 |
| | | | | R2 | 0.8 | 0.6 | 0.8 | 0.8 |
| | | | | NSE | 0.3 | 0.2 | 0.6 | 0.6 |
| II | Average | | | Peak error | – | 0.4 | 0.5 | 0.4 |
| | | | | R2 | – | 0.8 | 0.5 | 0.6 |
| | | | | NSE | – | −0.9 | −1.8 | −1.1 |
| III | Average | | | Peak error | – | 0.7 | 0.2 | 0.0 |
| | | | | R2 | – | 0.6 | 0.6 | 0.8 |
| | | | | NSE | – | 0.0 | −0.5 | 0.1 |


snow cover was limited, which also made the soil sensitive to thawing when the air temperature increased.

The simulation results from LISEM were only available for Period I because discharge for the other periods was most likely caused by subsurface drainage water, a process which is currently not included in the LISEM model. As such, this event-based model could not be configured to simulate discharge during Period II. The predicted discharge from the other three models where discharge could be modelled (i.e., MIKE SHE, CoupModel and HBV) is illustrated in Fig. 3. All models substantially overestimated the measured intensity of the peaks, but the overestimation by CoupModel and HBV was more substantial than that by MIKE SHE. The overall performance of the MIKE SHE model was better in the validation events (22, 25, 29 November and 1 December 2007) when hydrological conditions were similar to the calibration event (Table 5).

3.3. Period III

Period III included non-frozen soil and rain event conditions. This period provides validation of the models' performances for two rainfall–runoff events on 5 and 8 November 2007. These rainfall events had a maximum rainfall intensity of about 4.5 and 5 mm/h, respectively. Both the catchment conditions and the rainfall events' intensity were, thus, rather different compared to the calibration event (Table 3).

The results from both events showed that the three models (again, MIKE SHE, CoupModel and HBV) predicted the peak flow and discharge dynamics differently (Fig. 4). The results indicated that the lumped models (HBV and CoupModel) were apparently more robust and stable with regard to predictions during this period, when runoff was generated from rain on non-frozen soils rather than from rain and snowmelt on frozen soil (Table 5).

4. Discussion and concluding remarks

4.1. Hydrological models as learning tools

The previous study by Deelstra et al. (2010a) in the Skuterud catchment showed how different types of hydrological models performed when the results were integrated over longer periods of time (more than one week). This study, however, focused on how models with different structures and calibration procedures performed at event-based timescales, which may be more relevant in the practical context of estimating peak discharges and designing road systems. Clearly, the differences in model setups and application do not allow for identification of a truly best-performing model; however, these differences can help with regard to learning more about the relative dominance of hydrological processes in this region, how these potentially vary between events (even in the same season), and/or the role of model calibrations for event-based model applications in an operational sense.

Specifically, the models used in the present study allow for examination of the seasonal processes concerning winter-related hydrological behaviour in more detail. During the spring snowmelt period (Period I), a large amount of water became available during rainfall-on-snow events. As the soil infiltration capacity was limited by the distribution

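Table 6
Percentage change in peak error, R2 and NSE before and after calibration of discharge during each event for MIKE SHE, CoupModel and HBV. There are no statistics available for LISEM because it could not produce discharge before calibration.

| Period | Event number | Role | Date | Statistic | LISEM | MIKE SHE | CoupModel | HBV |
|---|---|---|---|---|---|---|---|---|
| I | 1 | Calibration | 10–11 Jan 2008 | Peak error | – | 2.2 | 0.4 | 0.6 |
| | | | | R2 | – | 0.8 | 0.2 | 0.5 |
| | | | | NSE | – | 2.9 | 0.6 | 1.1 |
| I | 2 | Validation | 13–18 Jan 2008 | Peak error | – | 4.2 | 0.2 | −2.3 |
| | | | | R2 | – | 0.3 | 0.0 | 0.1 |
| | | | | NSE | – | 0.2 | 0.2 | 0.1 |
| II | 3 | Validation | 22–23 Nov 2007 | Peak error | – | −0.3 | 0.4 | 0.0 |
| | | | | R2 | – | 1 | −0.2 | 0.0 |
| | | | | NSE | – | 3.3 | 0.4 | −0.8 |
| II | 4 | Validation | 24–25 Nov 2007 | Peak error | – | −0.3 | 0.7 | 0.0 |
| | | | | R2 | – | 1 | 0.0 | 0.0 |
| | | | | NSE | – | 1.6 | 1.1 | −0.4 |
| II | 5 | Validation | 29–30 Nov 2007 | Peak error | – | 0.1 | 0.3 | 0.0 |
| | | | | R2 | – | 0.6 | −1.4 | −0.1 |
| | | | | NSE | – | 0.0 | 4.0 | 3.5 |
| II | 6 | Validation | 01–02 Dec 2007 | Peak error | – | −0.6 | 0.7 | 0.0 |
| | | | | R2 | – | 0.4 | 0.2 | 0.5 |
| | | | | NSE | – | 0.4 | 0.9 | 0.7 |
| III | 7 | Validation | 05–06 Nov 2007 | Peak error | – | 0.7 | 0.0 | 0.7 |
| | | | | R2 | – | 0.9 | −0.1 | 0.0 |
| | | | | NSE | – | 2.2 | 0.2 | 0.1 |
| III | 8 | Validation | 08–09 Nov 2007 | Peak error | – | 0.5 | 0.0 | 0.0 |
| | | | | R2 | – | 0.6 | 0.0 | 0.2 |
| | | | | NSE | – | 2.2 | 0.3 | 1.1 |
| I | Average | | | Peak error | – | 3.4 | 0.3 | −0.8 |
| | | | | R2 | – | 0.8 | 0.5 | 0.3 |
| | | | | NSE | – | 1.8 | 0.7 | 0.6 |
| II | Average | | | Peak error | – | −0.3 | 0.3 | 0.0 |
| | | | | R2 | – | 0.7 | −0.3 | 0.1 |
| | | | | NSE | – | 1.3 | 1.7 | 0.7 |
| III | Average | | | Peak error | – | 0.6 | 0.0 | 0.3 |
| | | | | R2 | – | 0.8 | −0.1 | 0.1 |
| | | | | NSE | – | 2.2 | 0.2 | 0.6 |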


of frozen soil, this translated into a large and relatively quick stream response (Fig. 2). Similar hydrological responses have been observed across Scandinavia, from boreal (e.g. Lyon et al., 2012) to more alpine settings (e.g. Dahlke et al., 2012) under mid-to-late winter conditions. Because of differences in ability to represent specific processes (subsurface frozen soils) and model structures (distributed versus lumped setups), CoupModel and HBV failed to simulate peak discharge in the calibration event, while LISEM and MIKE SHE showed good estimation of peak discharge but failed to capture the true recession limb (Fig. 2). This carried over to the compound event used for validation in Period I and the events in Period II. The models at the more distributed, physically-based end of the spectrum (MIKE SHE and LISEM) appeared capable of replicating the quick response of rain on snowmelt events, while the more lumped models (CoupModel and HBV) were able to match runoff peaks but simulated a more dampened hydrological response (i.e. high baseflow between events).

These general responses at the event scale can only be partly attributed to process representation within the various models. There is a confounding impact of the calibration procedure adopted for each event-specific model application. Hydraulic conductivity and hence the infiltration capacity of the topsoil are strongly influenced by the total water content and the relative distribution of liquid and frozen water in soil pores (Granger et al., 1984; Johnsson and Lundin, 1991). For example, CoupModel reduces the infiltration rate based on soil temperature, ice content and soil properties, whereas MIKE SHE does not change the infiltration capacity even when the soil is frozen. The influence of these subsurface frozen properties carried over to the results for Period II, which was clearly dominated by rain on snow with frozen subsurface conditions (Fig. 3). This can be explained by the fact that the model parameters used were manually calibrated during Period I. As such, modeller subjectivity plays some role in model performance. Under the event-specific applications considered here, the physically based models, which relied more on manual calibration to assign parameter values and thus inherently incorporated more modeller subjectivity, performed better at estimating peak flow rates and timing than the more lumped approaches during Period II, as it is similar to Period I. However, in Period III, the lumped models performed better in estimating runoff peaks under conditions that were not necessarily similar to those of the calibration period. In fact, the physically based, distributed models actually tended to perform worse after calibration for the events in Period III.

This apparent flexibility in the lumped model performance outside the conditions relevant for the calibration event is at least partly due to methodological differences in how these models were calibrated. Since HBV and CoupModel were calibrated such that best performing parameter sets were retained (rather than one optimal set), these models were more resilient for representing the rainfall–runoff processes that were likely active in Period III. For example, HBV and CoupModel simulated higher evapotranspiration than MIKE SHE during Period III. In HBV, evaporation from rainfall was computed in the soil routine as a function of actual water storage and maximum soil moisture storage capacity (e.g., the field capacity parameter). Field capacity was one of the most sensitive parameters calibrated by the model, with higher values reflecting greater soil water storage capacity and therefore larger water availability for evaporation. In CoupModel, evapotranspiration was determined from intercepted water, canopy transpiration and soil evaporation. There were a number of parameters representing soil evaporation calibrated by the model (e.g. the lower the soil surface resistance and vapour pressure at the soil surface, the higher the evaporation). The evapotranspiration in MIKE SHE, on the other hand, was modelled to be almost zero in all three periods analysed since the influence of evapotranspiration on runoff was almost negligible during winter months at the event scale.
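The dependence of HBV evaporation on actual soil water storage and field capacity can be illustrated with the formulation commonly given in textbook descriptions of the HBV soil routine. The sketch below is an assumption made for illustration: the threshold parameter `lp`, its default value, and the function and argument names are hypothetical and are not taken from the model setup used in this study.

```python
def hbv_actual_evaporation(potential_evap, soil_moisture, field_capacity, lp=0.9):
    """Actual evaporation (mm per time step) in a textbook HBV-style soil routine.

    Evaporation equals the potential rate when soil moisture exceeds the
    fraction `lp` of field capacity and is reduced linearly below that.
    `lp` is an assumed parameter, not a value reported in this study.
    """
    if field_capacity <= 0 or lp <= 0:
        raise ValueError("field_capacity and lp must be positive")
    threshold = lp * field_capacity
    # Ratio of stored water to the threshold, limited to the range [0, 1].
    reduction = min(max(soil_moisture / threshold, 0.0), 1.0)
    return potential_evap * reduction


# Hypothetical values in mm: a partly filled soil store on a day with
# 2 mm of potential evaporation.
print(hbv_actual_evaporation(potential_evap=2.0, soil_moisture=45.0,
                             field_capacity=100.0))
```

Because the reduction factor depends on the ratio of storage to field capacity, the calibrated field capacity controls both how much water the soil store can hold and how quickly evaporation is throttled as the store empties.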

4.2. Evaluation of models for road systems and other practical applications

In order to simulate particular hydrological behaviour of catchments near road structures, an appropriate model structure, identifiability of parameter values and minimisation of model analytical uncertainty are vital (Son and Sivapalan, 2007). Isolating each of these vital aspects across models is rather difficult. For example, to enable direct comparison between various models, similar calibration methodologies would need to be implemented with regard to the data used, objective functions, parameter optimization technique, and selection of parameters to be calibrated. This is a limitation of the current study, where we have adopted different approaches for the four models, ranging from only one parameter calibrated manually for MIKE SHE to 17 parameters calibrated by a numerical routine for HBV.

Model calibration is a complicated process that is sensitive to modeller subjectivity. Overall model performance in an operational sense can clearly depend on who applies a given model and how it was calibrated. The differences in calibration approaches and (consequently) the differences in parameter values specified by the modellers in this current study thus make it difficult to assess whether differences in simulation results (model performance) are due to differences in model structures or simply artefacts due to different calibration approaches. This limitation is common across research hydrology, where different models are applied by different modellers in similar regions, and creates a gap between modelling research and direct application (e.g., Agnew et al., 2006). This highlights the general difficulty associated with comparison of hydrological model performance under a specific application. While certain model aspects (e.g., a physical basis versus a conceptual basis) may be a priori identified as attractive, once a model is selected and applied to a given catchment, confounding influences (e.g., differences of calibration procedure or modeller subjectivity) make determining a truly best model difficult.

Further, as shown in this current study, it clearly cannot be concluded that a certain type of model performs better than others, since model performance can also vary based on the hydroclimatic conditions under which the models are applied. Our findings with regard to the relative rankings of individual model performance for this cold region may not necessarily hold in hydroclimatic regions where different processes dominate. This becomes crucial when considering applications, particularly those involving management decisions, of models originally developed for regions with very distinct hydroclimatological settings (Steenhuis et al., 2009). As such, spatiotemporal variations in the underlying dominant hydrologic processes must be considered, particularly when establishing management tools for exploring scenarios of development (e.g., Lyon et al., 2014). To truly assess a best performing hydrological model at a given location, many similar modelling experiments comparing across different calibration schemes under different catchment conditions and by different modellers would be required. The process of assessing the performance of a hydrological model thus requires both subjective and objective comparisons of not only the simulated and observed values (Dawson et al., 2007) but also the modelling application situation (i.e., the modeller, the conditions, the calibration approach).

Fig. 3. Period II: Simulated discharge for MIKE SHE (blue line), median value CoupModel (grey line), CoupModel uncertainty band (grey band), median value HBV (orange line), HBV uncertainty band (yellow band). Measured discharge: black dashed line. Panels: Event 3, Event 4, Event 5 and Event 6. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

When selecting a hydrological modelling tool for estimating event-scale hydrologic responses to inform, for example, road structure design, the type of flood to be estimated also needs to be considered. The current study demonstrates that hydrological models should be selected on a case-by-case basis and performance evaluated on an application-by-application basis since how a model is applied is just as important as the model selection itself. For example, the manual calibration process, which was applied in LISEM and MIKE SHE, mainly depends on the modeller's skill and familiarity with both the model structure and the catchment being considered. In general, it is difficult to identify a clear point indicating the end of the calibration process and hence different results can be obtained by different modellers (Wheater, 2002). The time-consuming nature of manual calibration is another potential problem with this type of calibration. These difficulties can be somewhat by-passed through automatic calibration procedures, such as those considered in this current study for application of the HBV and CoupModel based on Monte Carlo simulation (Beven, 2000) together with a multi-objective criterion. This may also allow for an objective strategy to estimate parameters, eliminating the subjective effects involved in the manual approach (Boyle et al., 2000). Still, automatic calibration methods have not yet developed to the point of entirely replacing manual methods, so automatic calibration is often effective when combined with a manual procedure (e.g., Sorooshian and Gupta, 1995).
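A Monte Carlo calibration of the kind described above, in which parameter values are drawn from a uniform range and every parameter set that meets the acceptance criteria is retained rather than a single optimum, might be sketched as follows. The one-parameter `linear_reservoir` is a hypothetical stand-in for a lumped model (it is neither HBV nor CoupModel), the multi-objective criterion is assumed to be a joint threshold on NSE and absolute peak error, and the thresholds, parameter range and number of runs are illustrative values, not those used in this study.

```python
import random


def linear_reservoir(rain, k, storage0=0.0):
    """Toy lumped model (assumption): outflow is a fraction k of storage per step."""
    storage, flow = storage0, []
    for r in rain:
        storage += r
        q = k * storage
        storage -= q
        flow.append(q)
    return flow


def nse(simulated, observed):
    """Nash-Sutcliffe efficiency of a simulated series against observations."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den


def monte_carlo_calibrate(rain, observed, k_range=(0.05, 0.95), n_runs=2000,
                          min_nse=0.6, max_abs_peak_error=0.5, seed=1):
    """Return all (NSE, k) pairs that satisfy both acceptance criteria."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    behavioural = []
    for _ in range(n_runs):
        k = rng.uniform(*k_range)  # uniform prior range for the parameter
        simulated = linear_reservoir(rain, k)
        score = nse(simulated, observed)
        peak_err = max(simulated) - max(observed)
        # Joint (multi-objective) acceptance: good overall fit AND good peak.
        if score >= min_nse and abs(peak_err) <= max_abs_peak_error:
            behavioural.append((score, k))
    behavioural.sort(reverse=True)  # best-scoring parameter set first
    return behavioural


# Synthetic example: "observations" generated with k = 0.4.
rain = [0.0, 5.0, 8.0, 2.0, 0.0, 0.0, 0.0, 0.0]
observed = linear_reservoir(rain, 0.4)
retained = monte_carlo_calibrate(rain, observed)
print(len(retained), retained[0])
```

Keeping the whole set of accepted runs is what allows results to be reported as a median with a minimum to maximum range, at the cost of many more model runs than a single manual calibration.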

Such application-by-application considerations may in part be hindered by the inherent requirements of given model structures. For ungauged basins with no real-time monitoring of discharge, for example, distributed models with a more physical basis might be difficult to generalise and apply. In the lumped approach models, the parameters pertaining to different processes associated with water balance were selected for calibration by specification of a uniform range for each parameter. This explains why these models were able to perform better in Period III in the current case study, as they were flexible because the parameters had been defined within wide ranges. However, many of these parameter sets were essentially non-behavioural during the calibration period. As such, to some extent, the calibration process was able to compensate for a lack of parameter value identifiability. This needs to be explored further by applying the various models under different seasonal conditions (e.g., summer floods versus winter floods). In addition, more consistent calibration methods, for example using the same optimization strategy across all models, need to be better investigated before a specific model can be recommended for designing road structures in this region.

From the practical application perspective of road design and maintenance in a region where seasonality is a major factor, the accuracy of a particular model thus depends on a combination of (1) its structure (distributed versus lumped), (2) its ability to represent specific processes (e.g. subsurface frozen soils), (3) the availability of data, and (4) modeller subjectivity (e.g., calibration procedure). In addition, as extreme weather events resulting in high flows under the current climate regime can cause considerable damage to transport infrastructure (Kalantari and Folkeson, 2013), models applied to simulate total runoff when designing road drainage structures must be able to represent seasonal hydrological behaviour. Development of a process-based, self-adaptive simulation package, which flexibly fits the changes of the hydrological system and represents seasonal hydrological behaviour, could be an efficient, reliable and (potentially) innovative way forward in regions such as Scandinavia where the future climate scenarios anticipate alterations to seasonality. In fact, more flexible approaches that bring together aspects of the lumped and distributed models may be required to adequately address different hydrological conditions in systems where seasonality is clearly a major factor (e.g., Clark et al., 2008; Fenicia et al., 2008). While more work is needed on this subject, it is clear that the a priori choices made in hydrological modelling applications will have significant influence on the resultant representation of the hydrological response and subsequent road structure design.

Fig. 4. Period III: Simulated discharge for MIKE SHE (blue line), median value CoupModel (grey line), CoupModel uncertainty band (grey band), median value HBV (orange line), HBV uncertainty band (yellow band). Measured discharge: black dashed line. Panels: Event 7 and Event 8. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Acknowledgements

This study forms part of an ongoing collaboration between the ‘ClimRunoff’ project, which is coordinated by Bioforsk in Norway and funded by the Norwegian Research Council, and the ‘Adaptation of road drainage structures to climate change’ project at the Royal Institute of Technology in Sweden, conducted within the Centre for Operations and Maintenance and funded by the Swedish Road Administration (currently the Swedish Transport Administration). Further, SL acknowledges funding from the Swedish Research Council (VR) grant 2011-4390. The approach for the data analysis was taken by SL and ZK. The data was provided by JS and HF. The setup of the MIKE SHE model was made by MS and ZK, the LISEM model by JS, and the CoupModel and the HBV model by PEJ and ZK.

Appendix A. Supplementary data

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.scitotenv.2014.09.030.

References

Agnew LJ, Lyon SW, Gérard-Marchant P, Collins VB, Lembo AJ, Steenhuis TS, et al. Identifying hydrologically sensitive areas: bridging the gap between science and application. J Environ Manage 2006;78:63–76.

Benzvi A. Toward a new rational method. J Hydraul Eng ASCE 1989;115:1241–55.

Bergström S. Development and application of a conceptual runoff model for Scandinavian catchments. Norrköping: SMHI Reports RHO (7); 1976 [Ph.D. Thesis].

Bergström S. The HBV model – its structure and applications. Norrköping: SMHI Reports RH (4); 1992.

Beven KJ. Rainfall–runoff modelling. Chichester: John Wiley & Sons Ltd; 2000 [19pp, 297pp, 319pp].

Boyle DP, Gupta HV, Sorooshian S. Toward improved calibration of hydrologic models: combining the strengths of manual and automatic methods. Water Resour Res 2000;36:3663–74. http://dx.doi.org/10.1029/2000wr900207.

Breuer L, Huisman JA, Willems P, Bormann H, Bronstert A, Croke BFW, et al. Assessing the impact of land use change on hydrology by ensemble modeling (LUCHEM). I: model intercomparison with current land use. Adv Water Resour 2009;32:129–46. http://dx.doi.org/10.1016/j.advwatres.2008.10.003.

Clark MP, Slater AG, Rupp DE, Woods RA, Vrugt JA, Gupta HV, et al. Framework for understanding structural errors (FUSE): a modular framework to diagnose differences between hydrological models. Water Resour Res 2008;44. http://dx.doi.org/10.1029/2007wr006735. [Artn W00b02].

Crump ES. A new method of gauging stream flow with little afflux by means of a submerged weir of triangular profile. Proc Inst Civ Eng 1952;1(I):223–42.

Dahlke HE, Lyon SW, Stedinger JR, Rosqvist G, Jansson P. Contrasting trends in floods for two sub-arctic catchments in northern Sweden – does glacier presence matter? Hydrol Earth Syst Sci 2012;16:2123–41. http://dx.doi.org/10.5194/hess-16-2123-2012.

Dawson CW, Abrahart RJ, See LM. HydroTest: a web-based toolbox of evaluation metrics for the standardised assessment of hydrological forecasts. Environ Model Software 2007;22:1034–52. http://dx.doi.org/10.1016/j.envsoft.2006.06.008.

Deelstra J, Iital A. The use of the flashiness index as a possible indicator for nutrient loss prediction in agricultural catchments. Boreal Environ Res 2008;13:209–21.

Deelstra J, Kværnø SH, Skjevdal R, Vandsemb S, Eggestad HO, Ludvigsen GH. A general description of the Skuterud catchment. Jordforsk report no. 61/05; 2005 [Bioforsk, Ås, Norway].

Deelstra J, Kvaerno SH, Granlund K, Sileika AS, Gaigalis K, Kyllmar K, et al. Runoff and nutrient losses during winter periods in cold climates requirements to nutrient simulation models. J Environ Monit 2009;11:602–9. http://dx.doi.org/10.1039/B900769p.

Deelstra J, Farkas C, Engebretsen A, Kværnø S, Beldring S, Olszewska A, et al. Can we simulate runoff from agriculture-dominated watersheds? Comparison of the DrainMod, SWAT, HBV, COUP and INCA models applied for the Skuterud catchment. In: Grzybek A, editor. Bioforsk FOKUS 5 (6), modelling of biomass utilization for energy purposes; 2010a. p. 119–28.

Deelstra J, Farkas C, Youssef M. Modeling runoff from a small artificially drained agricultural catchment in Norway, using the DrainMod model. ASABE's 9th International Drainage Symposium. Québec City, Canada: XVIIth World Congress of CIGR; 2010b [June 13–17].

Doe W, Harmon III R. Introduction to soil erosion and landscape evolution modeling. In: Harmon R, Doe III W, editors. Landscape erosion and evolution modeling. US: Springer; 2001. p. 1–14.

Fenicia F, McDonnell JJ, Savenije HHG. Learning from model improvement: on the contribution of complementary data to process understanding. Water Resour Res 2008;44. http://dx.doi.org/10.1029/2007wr006386. [Artn W06419].

Granger RJ, Gray DM, Dyck GE. Snowmelt infiltration to frozen prairie soils. Can J Earth Sci 1984;21:669–77. http://dx.doi.org/10.1139/E84-073.

Green Paper EU. Adapting to climate change in Europe – options for EU action (no. COM(2007) 354 final {SEC (2007) 849}). Brussels: Commission of the European Communities; 2007. p. 5–17.

Gurtz J, Zappa M, Jasper K, Lang H, Verbunt M, Badoux A, et al. A comparative study in modelling runoff and its components in two mountainous catchments. Hydrol Process 2003;17:297–311. http://dx.doi.org/10.1002/hyp.1125.

Hayashi M, van der Kamp G, Schmidt R. Focused infiltration of snowmelt water in partially frozen soil under small depressions. J Hydrol 2003;270:214–29. http://dx.doi.org/10.1016/S0022-1694(02)00287-1. [Pii S0022-1694(02)00287-1].

Hillel D. Environmental soil physics. San Diego, USA: Academic Press; 1998.Hollander HM, Blume T, Bormann H, Buytaert W, Chirico GB, Exbrayat JF, et al. Compara-

tive predictions of discharge from an artificial catchment (chicken creek) usingsparse data. Hydrol Earth Syst Sci 2009;13:2069–94.

Jansson P-E, Karlberg L. Coupled heat andmass transfer model for soil–plant–atmospheresystems. Sweden: Royal Institute of Technology; 2004.

Jetten V, De Roo APJ. Spatial analysis of erosion conservationmeasures with LISEM. Ch 14.In: Harmon RS, Doe WW, editors. Landscape erosion and evolution modeling. NewYork: Kluwer Academic/Plenum; 2001. p. 429–45.

Jin XL, Xu CY, Zhang Q, Singh VP. Parameter and modeling uncertainty simulated by GLUEand a formal Bayesian method for a conceptual hydrological model. J Hydrol 2010;383:147–55. http://dx.doi.org/10.1016/j.jhydrol.2009.12.028.

Johnsson H, Lundin LC. Surface runoff and soil–water percolation as affected by snow andsoil frost. J Hydrol 1991;122:141–59. http://dx.doi.org/10.1016/0022-1694(91)90177-J.

Kalantari Z, Folkeson L. Road drainage in Sweden: current practice and suggestions for ad-aptation to climate change. J Infrastruct Syst 2013;19:147–56. http://dx.doi.org/10.1061/(Asce)Is.1943-555x.0000119.

Kværnø SH, Deelstra J. Modelling soil frost and snow dynamics under unstable winterclimate. CoupModel simulations in the Skuterud catchment. Jordforsk report no.37/03; 2003 [Ås, Norway].

Kværnø SH, Stolte J. Uncertainty in runoff and erosion simulated by the LISEM model,as affected by soil physical properties input data source; 2012 [Submitted toCATENA].

Kværnø SH, Haugen LE, Børresen T. Variability in topsoil texture and carbon contentwithin soil map units and its implications in predicting soil water content foroptimum workability. Soil Tillage Res 2007;95:332–47. http://dx.doi.org/10.1016/j.still.2007.02.001.

Loague K, Vanderkwaak JE. Simulating hydrological response for the R-5 catchment:comparison of two models and the impact of the roads. Hydrol Process 2002;16:1015–32. http://dx.doi.org/10.1002/Hyp.316.

Lyon SW, Nathanson M, Spans A, Grabs T, Laudon H, Temnerud J, et al. Specific dischargevariability in a boreal landscape. Water Resour Res 2012;48. http://dx.doi.org/10.1029/2011wr011073. [Artn W08506].

Lyon SW, Koutsouris A, Scheibler F, Jarsjö J, Mbanguka R, Tumbo M, et al. Interpreting characteristic drainage timescale variability across Kilombero Valley, Tanzania. Hydrol Process 2014. http://dx.doi.org/10.1002/hyp.10304. [in press].

Maidment DR. Handbook of hydrology. New York: McGraw-Hill; 1993.

Plesca I, Timbe E, Exbrayat JF, Windhorst D, Kraft P, Crespo P, et al. Model intercomparison to explore catchment functioning: results from a remote montane tropical rainforest. Ecol Model 2012;239:3–13. http://dx.doi.org/10.1016/j.ecolmodel.2011.05.005.

Reed S, Koren V, Smith M, Zhang Z, Moreda F, Seo DJ, et al. Overall distributed model intercomparison project results. J Hydrol 2004;298:27–60. http://dx.doi.org/10.1016/j.jhydrol.2004.03.031.

Refsgaard JC, Knudsen J. Operational validation and intercomparison of different types of hydrological models. Water Resour Res 1996;32:2189–202. http://dx.doi.org/10.1029/96wr00896.

Refsgaard JC, Storm B. MIKE SHE. In: Singh VP, editor. Computer models of watershed hydrology. Colorado: Water Resources Publications; 1995. p. 809–46.

Schneider SH, Semenov S, Patwardhan A, Burton I, Magadza CHD, Oppenheimer M, et al. Assessing key vulnerabilities and the risk from climate change. In: Parry ML, Canziani OF, Palutikof JP, van der Linden PJ, Hanson CE, editors. Climate change 2007: impacts, adaptation and vulnerability. Contribution of working group II to the fourth assessment report of the Intergovernmental Panel on Climate Change (IPCC). Cambridge, UK: Cambridge University Press; 2007. p. 779–810.

Son K, Sivapalan M. Improving model structure and reducing parameter uncertainty in conceptual water balance models through the use of auxiliary data. Water Resour Res 2007;43. http://dx.doi.org/10.1029/2006wr005032. [Artn W01415].

Sorooshian S, Gupta VK. Model calibration. In: Singh VP, editor. Computer models of watershed hydrology. Colorado: Water Resources Publications; 1995.

Statens vegvesen. Vegbygging (normaler), Nr. 018 i Statens vegvesens håndbokserie. http://www.vegvesen.no/_attachment/68724/binary/3173, 2011. [In Norwegian].

Steenhuis TS, Collick AS, Easton ZM, Leggesse ES, Bayabil HK, White ED, et al. Predicting discharge and sediment for the Abay (Blue Nile) with a simple model. Hydrol Process 2009;23(26):3651–770. http://dx.doi.org/10.1002/hyp.7513.

Thue-Hansen V, Grimenes AA. Meteorologiske data for Ås 2007–2008. Ås, Norway: Universitetet for Miljø- og Biovitenskap; 2009.

Vägverket. VVMB 310 hydraulisk dimensionering (in Swedish). Method description. Borlänge: Vägverket; 2008. p. 61 [2008].

Wheater HS. Progress in and prospects for fluvial flood modelling. Philos Trans Math Phys Eng Sci 2002;360(1796):1409–31.