Boise River Flood Frequency Analysis Using Non-Linear Regression

BRIAN PORTUGAIS, DEPARTMENT OF CIVIL ENGINEERING, BOISE STATE UNIVERSITY, MAY 12, 2012



EXECUTIVE SUMMARY

A mathematical model relating flood recurrence intervals (T) to their associated magnitudes (QT) was needed for the Boise River system. Recurrence intervals of importance included the 2-, 5-, 10-, 25-, 50-, 75-, 100-, 200-, 250-, 500-, and 1000-year events. To account for safety factors, 90 % and 95 % confidence limits were calculated.

Flood frequency analysis was performed using the Log Pearson Type III (LP3) distribution, with the Method of Moments (MOM) used to determine its statistical parameters. Concerns about a potential climatic shift in the data record, associated with a change in phase of the Pacific Decadal Oscillation (PDO), were addressed by developing two alternative models. These functional relationships used a standardized climatic index value, derived from sea surface temperature anomalies, to account for PDO variability.

Analyses were performed on the annual maximum series of data records for the Boise River from 1956 through 2012. Procedures for fitting the LP3 distribution were taken from Bulletin 17B (B17) (IACWD 1982). Estimated magnitudes and associated confidence limits from the LP3 MOM model agreed well with results from a graphical spreadsheet regression. Regression equations that account for PDO variability were developed to estimate QT for any T, with 90 % and 95 % confidence intervals.


TABLE OF CONTENTS

Executive Summary
List of Figures
List of Tables
Introduction
Methodology
Results
Discussion & Conclusions
References
Appendix
    LP3 Scenarios Outliers Plots
    PDO Warm & Cold Phase Regression Graphs


LIST OF FIGURES

Figure 1. Lower Boise River Watershed
Figure 2. Empirical Cumulative Distribution vs. Theoretical Cumulative Distribution Normal P - P Plot, Quantile vs. Maximum Flows Normal Q - Q Plot, Relative Frequency Histogram, and Empirical Cumulative Distribution
Figure 3. Frequency Curves with 90 % & 95 % Confidence Intervals
Figure 4. Q - Q Plots Excluding/Including Outliers – Observed & Predicted Magnitudes
Figure 5. LP3 Regression Results Including Outlier – The first plot represents the evolution of the standardized residuals as a function of the dependent variable, the second plot is a bar chart of the residuals, and the third plot is the observed vs. predicted values
Figure 6. LP3 Regression Results Excluding Outlier – The first plot represents the evolution of the standardized residuals as a function of the dependent variable, the second plot is a bar chart of the residuals, and the third plot is the observed vs. predicted values
Figure 7. P-P & Q-Q Plots for Residuals Excluding/Including Outliers
Figure 8. Pacific Decadal Oscillation Index. Image taken from www.nwfsc.noaa.gov/research/divisions/fed/oeip/ca-pdo.cfm
Figure 9. Boise River Annual Maximum Series and PDO Index
Figure 10. Regression Results PDO Warm Phase
Figure 11. Regression Results PDO Cold Phase

LIST OF TABLES

Table 1. Regression Equations and R² Coefficient
Table 2. PDO Cold & Warm Phase Modeled & Predicted Magnitudes
Table 3. Calculated & Predicted Magnitudes With 90 % and 95 % Confidence Intervals of LP3 Distributions With & Without Outlier


INTRODUCTION

The US Bureau of Reclamation is the governing proprietor of a variety of water resource infrastructure along the Boise River that provides flood protection for Boise and the surrounding areas, known as the Lower Boise River watershed (see Figure 1). The Lower Boise River, a 64-mile stretch originating at Lucky Peak Dam, flows northwesterly through Ada and Canyon counties and the cities of Boise and Caldwell (Lower Boise Watershed Council 2012).

Figure 1. Lower Boise River Watershed

To expedite flood frequency calculations, a mathematical relationship was needed that could

estimate the magnitude of any flood given a recurrence interval.


Data for the Boise River were collected for the years 1956 through 2012, and an annual maximum series was developed. Procedures taken from B17 were applied in a spreadsheet program, using a MOM technique to fit the LP3 distribution to the log-transformed annual series. Automated calculations of the B17 procedures in the statistical software package HEC-SSP were used to verify the results. Regression analysis on the LP3 parameters with climatic indices (http://jisao.washington.edu/pdo/PDO.latest) incorporated two phase shifts for the PDO.

METHODOLOGY

Equations used to fit the LP3 distribution and determine its parameters can be found in B17. The general procedure consisted of estimating the mean, standard deviation, and skew of the logarithms of the sample data with traditional moment estimators (Griffis and Stedinger, 2007; IACWD 1982). Because data availability is generally limited, the sample skew and a regional skew, along with their mean-square errors, were used to obtain a weighted skew (Griffis and Stedinger, 2007; IACWD 1982). Quantile estimates were made at the exceedance probabilities corresponding to the aforementioned return periods. The upper and lower confidence limits were computed at confidence levels of 90 % and 95 %. Frequency curves with confidence limits and estimated magnitudes were constructed (see Appendix).

Outliers are data points that depart significantly from the trend of the remaining data (Bulletin 17B). Because B17 uses a log transform, one or more unusually low flow values can distort the entire fitted distribution (Griffis and Stedinger, 2007; Stedinger et al., 1992). A test for outliers truncated the data by one value (N = 1), removing the 1977 flow of 3023 cfs.
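The moment-based fitting steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the spreadsheet or HEC-SSP calculation used in the study: the flow values are hypothetical, the function names are invented for this sketch, and the frequency factor uses the Wilson-Hilferty approximation rather than the tabulated B17 values.

```python
import math
from statistics import NormalDist


def log_moments(flows):
    """Mean, standard deviation, and skew of the base-10 logs of the flows."""
    logs = [math.log10(q) for q in flows]
    n = len(logs)
    mean = sum(logs) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in logs) / (n - 1))
    skew = n * sum((x - mean) ** 3 for x in logs) / ((n - 1) * (n - 2) * std ** 3)
    return mean, std, skew


def weighted_skew(station_skew, station_mse, regional_skew, regional_mse):
    """Weight the station and regional skews inversely to their mean-square errors."""
    return (regional_mse * station_skew + station_mse * regional_skew) / (
        regional_mse + station_mse
    )


def frequency_factor(skew, T):
    """Wilson-Hilferty approximation of the LP3 frequency factor (an assumption)."""
    z = NormalDist().inv_cdf(1.0 - 1.0 / T)  # standard normal deviate for T
    if abs(skew) < 1e-9:
        return z
    return (2.0 / skew) * ((1.0 + skew * z / 6.0 - skew ** 2 / 36.0) ** 3 - 1.0)


def lp3_quantile(flows, T, skew=None):
    """Estimate the T-year magnitude; pass skew to override the sample skew."""
    mean, std, sample_skew = log_moments(flows)
    g = sample_skew if skew is None else skew
    return 10 ** (mean + frequency_factor(g, T) * std)


# Hypothetical annual peak flows in cfs, for illustration only
example_flows = [8200, 12500, 9800, 15300, 7100, 11000, 13800, 6400, 10200, 17500]

for T in (2, 10, 100):
    print(T, round(lp3_quantile(example_flows, T)))
```

In practice the weighted skew would be passed as the `skew` argument, with the regional skew and both mean-square errors taken from B17.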


RESULTS

The theoretical cumulative distribution of the annual maximum series was plotted versus its empirical cumulative distribution, and its normal quantiles were plotted versus the maximum magnitudes, to visually check the distribution's normality.

Data were fit to the LP3 distribution with the outlier remaining in the dataset. Quantile estimates of flood magnitudes were made for the T = 2-, 5-, 10-, 25-, 50-, 75-, 100-, 200-, 250-, 500-, and 1000-year events. The 90 % and 95 % lower and upper confidence limits were calculated for each recurrence interval, and curves were plotted on either side of the computed magnitudes. These curves are descriptive in nature and help display the band within which a known percentage of data points should fall. Frequency curves displaying the calculated magnitudes and their 90 % and 95 % confidence intervals were constructed for the LP3 distributions with and without the outlier, and for the PDO-adjusted warm and cold phase scenarios.

Nonlinear regression analysis on the function Y = pr1*Ln(X1) + pr2 was used to model the four LP3 scenarios. Equations were developed from the mathematical model's regression coefficients and used to predict values of QT at each recurrence interval. Plots of QT versus T, predicted QT versus calculated QT, and residual charts were produced for each scenario. The spreadsheet's logarithmic trendline feature was plotted on the graphs to compare its equation and R² with the model's. The model used an iterative procedure and reproduced the trendline's equation and R² value, but reported the coefficients to 15 significant digits.
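Because Y = pr1*Ln(X1) + pr2 is linear in pr1 and pr2, an ordinary least-squares fit on Ln(T) should converge to the same coefficients as an iterative solver. The sketch below is an independent check in Python, not the regression software used in the study; it fits the LP3 magnitudes including the outlier from Table 3, and the function name is invented for illustration. The result should land close to the LP3 Inc. Outlier trendline reported in Table 1 (y = 4513 x ln(x) + 11880, R² = 0.9892).

```python
import math

# Recurrence intervals (years) and LP3 magnitudes including the outlier (cfs), Table 3
T = [2, 5, 10, 25, 50, 75, 100, 200, 250, 500, 1000]
QT = [13103, 19167, 22913, 27315, 30364, 32060, 33229, 35941, 36785, 39326, 41754]


def fit_log_model(T, Q):
    """Least-squares fit of Q = pr1 * ln(T) + pr2; returns (pr1, pr2, r2)."""
    x = [math.log(t) for t in T]
    n = len(x)
    x_bar = sum(x) / n
    q_bar = sum(Q) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (qi - q_bar) for xi, qi in zip(x, Q))
    pr1 = sxy / sxx                      # slope on ln(T)
    pr2 = q_bar - pr1 * x_bar            # intercept
    ss_res = sum((qi - (pr1 * xi + pr2)) ** 2 for xi, qi in zip(x, Q))
    ss_tot = sum((qi - q_bar) ** 2 for qi in Q)
    return pr1, pr2, 1.0 - ss_res / ss_tot


pr1, pr2, r2 = fit_log_model(T, QT)
print(f"QT = {pr1:.2f} * ln(T) + {pr2:.2f}, R2 = {r2:.4f}")
```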

To check the quality of the confidence intervals, the normality of the residuals was verified; the residuals plotted reasonably well in Q-Q and P-P graphs. This check was completed for the regression residuals from the LP3 scenarios with and without the outlier.
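The residual check can be illustrated with the calculated and predicted magnitudes for the LP3 scenario including the outlier (Table 3). The Python sketch below is an assumption-laden illustration: the plotting position (i - 0.5)/n and the helper name are choices made here, so its Q-Q R² need not match the value from the study's software. The residual standard deviation, however, should come out near the 875.50 shown on the Q-Q plot axis for that scenario.

```python
from statistics import NormalDist, mean, pstdev

# Calculated and predicted magnitudes (cfs), LP3 including outlier, Table 3
calculated = [13103, 19167, 22913, 27315, 30364, 32060, 33229, 35941, 36785, 39326, 41754]
predicted = [15008, 19143, 22272, 26407, 29535, 31365, 32663, 35791, 36798, 39927, 43055]

# Sorted residuals are the observed quantiles of the Q-Q plot
residuals = sorted(c - p for c, p in zip(calculated, predicted))
n = len(residuals)
mu, sigma = mean(residuals), pstdev(residuals)

# Theoretical quantiles of Normal(mu, sigma) at assumed plotting positions
fitted = NormalDist(mu, sigma)
theoretical = [fitted.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


qq_r2 = r_squared(residuals, theoretical)
print(f"residual mean = {mu:.2f}, std = {sigma:.2f}, Q-Q R2 = {qq_r2:.4f}")
```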


DISCUSSION & CONCLUSIONS

The PDO is a pattern of climate variability that exhibits phase shifts every 20 – 30 years. To determine a model for the two phase shifts, a PDO index was applied to the annual maximum series (AMS). Although the index values ranged from approximately -2.25 to 1.75, a binary approach grouped the data as either positive or negative. The effectiveness of this grouping is not understood, and it may not be applicable.
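The binary grouping can be sketched as follows. The records are hypothetical values chosen for illustration, the function name is invented here, and counting an index of exactly zero as warm is an assumption the report does not address.

```python
# Hypothetical (year, annual peak in cfs, PDO index) records, for illustration only
records = [
    (2001, 9800, -0.6),
    (2002, 12400, 0.3),
    (2003, 15100, 1.1),
    (2004, 7600, -1.4),
    (2005, 11050, 0.0),
]


def split_by_pdo_phase(records):
    """Group annual peaks by the sign of the PDO index (zero counted as warm)."""
    warm = [q for _, q, pdo in records if pdo >= 0]
    cold = [q for _, q, pdo in records if pdo < 0]
    return warm, cold


warm_peaks, cold_peaks = split_by_pdo_phase(records)
print(len(warm_peaks), "warm-phase peaks;", len(cold_peaks), "cold-phase peaks")
```

Each group would then be fitted separately, which is why every phase model rests on fewer samples than the full record.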

An interesting feature, however, is the plot of the AMS and the PDO index (Figure 9). Trends between the two can be inferred when the data are plotted on two scales. This may be a coincidence, but the degree to which the trends follow one another was surprising. The PDO appeared to shift more sporadically in the last decade, which calls into question the assumption of a phase shift every 20 – 30 years.

A better approach would have been to regress the model's equation using the index as a parameter. The last shift, into the cold phase, occurred around 1997, but decadal trends have not been clear since. The client is urged to use caution when selecting the provided PDO models, because splitting the record reduced the number of samples applied to each respective LP3 fit and thus sharply limited the ability to place meaningful confidence on the estimates. For example, the 95 % confidence interval for the 100-year magnitude is 28,205 ≤ QT ≤ 41,053 for the full-record AMS and 30,003 ≤ QT ≤ 56,557 for the PDO's warm phase. A graph of the PDO oscillations is included in the Appendix.
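The loss of precision can be quantified directly from the limits quoted above, which are the 95 % limits for the 100-year event in Tables 2 and 3. A short Python calculation, with variable names chosen for illustration:

```python
# 95 % confidence limits for the 100-year magnitude (cfs), from Tables 2 and 3
full_record = (28205, 41053)   # LP3, full record including the outlier
warm_phase = (30003, 56557)    # PDO warm phase


def interval_width(limits):
    """Width of a (lower, upper) confidence interval."""
    lower, upper = limits
    return upper - lower


full_width = interval_width(full_record)   # 12848 cfs
warm_width = interval_width(warm_phase)    # 26554 cfs
print(f"warm-phase interval is {warm_width / full_width:.2f} times wider")
```

The warm-phase interval is roughly twice as wide as the full-record interval for the same recurrence interval.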

A potential source of error may have been removing the outlier without replacing it, for example with the mean of the distribution, or without applying other techniques that were not explored. However, because the P-P and Q-Q plots of the regression residuals had high R² values, removing the outlier and recalculating the LP3 parameters appeared to be the superior method.


Models, tables, and figures for the four scenarios can be found in the Appendix. Confidence intervals can be read from the tables with the understanding that they follow this mathematical relationship:

$$P_i\left\{ Q_{L,c} \le Q_T \le Q_{U,c} \right\}$$

where Pi is the probability that the estimated magnitude QT approximates the true frequency curve within the lower (QL,c) and upper (QU,c) limits at confidence level c (IACWD 1982). QL,c, QU,c, and QT can be taken from the tables in the Appendix.


REFERENCES

Bulletin #17B, 1982, Guidelines for Determining Flood Flow Frequency, Interagency Advisory Committee on Water Data (IACWD).

Griffis, V. W., J. R. Stedinger, and T. A. Cohn (2004), Log Pearson type 3 quantile estimators with regional skew information and low outlier adjustments, Water Resour. Res., 40, W07503, doi: 10.1029/2003WR002697.

Griffis, V. W., and Stedinger, J. R., 2007, Incorporating Climate Change and Variability into Bulletin 17B LP3 Model, World Environmental and Water Resources Congress 2007: Restoring Our Natural Habitat.

Lower Boise Watershed Council. 2012. “Lower Boise Watershed Council.” Accessed April 30. http://www.lowerboisewatershedcouncil.org/01_who-we-are/watershed-map.html.

Ramsay, J. O., and Silverman, B. W., 2002. Applied Functional Data Analysis.

Statistical Software Package (HEC-SSP) Version 2.0, October 2010, U.S. Army Corps of Engineers, Institute for Water Resources, Hydrologic Engineering Center.

PDO image taken from http://www.nwfsc.noaa.gov/research/divisions/fed/oeip/ca-pdo.cfm.


APPENDIX

Figure 2. Empirical Cumulative Distribution vs. Theoretical Cumulative Distribution Normal P - P Plot, Quantile vs. Maximum Flows Normal Q - Q Plot, Relative Frequency Histogram, and Empirical Cumulative Distribution


Table 1. Regression Equations and R² Coefficient.

Scenario            Modeled Regression Equation                            Iterations   R²
PDO Cold Phase      QT = 4728.52843281733 × Ln (T) + 10597.5605376217      3            0.997
PDO Warm Phase      QT = 6134.95672565557 × Ln (T) + 10220.6478136604      200          0.997
LP3 Inc. Outlier    QT = 4512.9597371817 × Ln (T) + 11879.5038666905       200          0.989
LP3 Exc. Outlier    QT = 4391.55086406432 × Ln (T) + 11943.1131118581      8            0.992

Scenario            Graphical Regression Equation      R²
PDO Cold Phase      y = 4728.5 × ln(x) + 10598         0.9968
PDO Warm Phase      y = 6135 × ln(x) + 10221           0.9975
LP3 Inc. Outlier    y = 4513 × ln(x) + 11880           0.9892
LP3 Exc. Outlier    y = 4391.6 × ln(x) + 11943         0.9916


Table 2. PDO Cold & Warm Phase Modeled & Predicted Magnitudes (all magnitudes in cfs)

PDO Cold Phase
                      90 % C Limits         95 % C Limits
T (yr)    QT          QL,c      QU,c        QL,c      QU,c        Predicted
2         12,723      11,437    14,170      11,085    14,631      13,876
5         18,317      16,332    21,001      15,847    21,969      18,208
10        21,923      19,286    25,739      18,669    27,178      21,486
25        26,344      22,772    31,808      21,965    33,948      25,818
50        29,536      25,221    36,339      24,264    39,059      29,096
75        31,362      26,601    38,980      25,555    42,057      31,013
100       32,644      27,562    40,853      26,451    44,191      32,374
200       35,692      29,822    45,369      28,553    49,360      35,651
250       36,663      30,535    46,825      29,215    51,034      36,706
500       39,654      32,713    51,358      31,233    56,265      39,984
1000      42,612      34,843    55,911      33,199    61,547      43,261

PDO Warm Phase
                      90 % C Limits         95 % C Limits
T (yr)    QT          QL,c      QU,c        QL,c      QU,c        Predicted
2         13,171      11,523    15,084      11,079    15,710      14,473
5         20,192      17,492    24,002      16,846    25,424      20,095
10        24,825      21,164    30,383      20,329    32,563      24,347
25        30,566      25,530    38,697      24,422    42,022      29,969
50        34,735      28,606    44,966      27,283    49,250      34,221
75        37,125      30,339    48,638      28,889    53,514      36,709
100       38,802      31,545    51,246      30,003    56,557      38,474
200       42,791      34,377    57,546      32,613    63,946      42,726
250       44,060      35,270    59,580      33,433    66,342      44,095
500       47,966      37,989    65,912      35,927    73,838      48,348
1000      51,819      40,638    72,268      38,349    81,410      52,600


Table 3. Calculated & Predicted Magnitudes With 90 % and 95 % Confidence Intervals of LP3 Distributions With & Without Outlier (all magnitudes in cfs)

LP3 Including Outlier
                      90 % C Limits         95 % C Limits
T (yr)    QT          QL,c      QU,c        QL,c      QU,c        Predicted
2         13,103      12,057    14,256      11,772    14,612      15,008
5         19,167      17,484    21,273      17,061    21,979      19,143
10        22,913      20,686    25,830      20,142    26,839      22,272
25        27,315      24,352    31,346      23,645    32,774      26,407
50        30,364      26,843    35,248      26,014    37,003      29,535
75        32,060      28,215    37,445      27,315    39,392      31,365
100       33,229      29,155    38,969      28,205    41,053      32,663
200       35,941      31,322    42,535      30,252    44,950      35,791
250       36,785      31,992    43,653      30,885    46,175      36,798
500       39,326      33,999    47,040      32,776    49,893      39,927
1000      41,754      35,902    50,306      34,566    53,489      43,055

LP3 Excluding Outlier
                      90 % C Limits         95 % C Limits
T (yr)    QT          QL,c      QU,c        QL,c      QU,c        Predicted
2         13,325      12,326    14,418      12,053    14,754      14,987
5         19,066      17,496    21,021      17,100    21,675      19,011
10        22,633      20,561    25,335      20,054    26,267      22,055
25        26,863      24,102    30,605      23,442    31,928      26,079
50        29,825      26,533    34,376      25,757    36,007      29,123
75        31,485      27,883    36,515      27,037    38,330      30,904
100       32,635      28,812    38,007      27,919    39,954      32,167
200       35,324      30,969    41,526      29,960    43,793      35,211
250       36,167      31,641    42,637      30,594    45,009      36,191
500       38,724      33,668    46,029      32,506    48,728      39,235
1000      41,193      35,611    49,338      34,335    52,366      42,279


Figure 3. Frequency Curves with 90 % & 95 % Confidence Intervals


Figure 4. Q - Q Plots Excluding/Including Outliers – Observed & Predicted Magnitudes

[Two panels: "Q - Q Plot excluding low outlier" and "Q - Q Plot including outlier". Both axes in cfs, 10,000 to 50,000; series plotted: Observed, Predicted.]


Figure 5. LP3 Regression Results Including Outlier – The first plot represents the evolution of

the standardized residuals as a function of the dependent variable, the second plot is a bar chart

of the residuals, and the third plot is the observed vs. predicted values


Figure 6. LP3 Regression Results Excluding Outlier – The first plot represents the evolution of

the standardized residuals as a function of the dependent variable, the second plot is a bar chart

of the residuals, and the third plot is the observed vs. predicted values


LP3 Scenarios Outliers Plots

Figure 7. P-P & Q-Q Plots for Residuals Excluding/Including Outliers

[Four panels:
P-P plot, residuals excluding outliers: theoretical cumulative distribution vs. empirical cumulative distribution, R² = 0.9133
Q-Q plot, residuals excluding outliers: quantile of Normal(-1.28E-12, 751.77) vs. residuals, R² = 0.8739
P-P plot, residuals including outliers: theoretical cumulative distribution vs. empirical cumulative distribution, R² = 0.9133
Q-Q plot, residuals including outliers: quantile of Normal(-5.79E-7, 875.50) vs. residuals, R² = 0.8752]


PDO Warm & Cold Phase Regression Graphs

Figure 8. Pacific Decadal Oscillation Index. Image taken from

www.nwfsc.noaa.gov/research/divisions/fed/oeip/ca-pdo.cfm

Figure 9. Boise River Annual Maximum Series and PDO Index


Figure 10. Regression Results PDO Warm Phase


Figure 11. Regression Results PDO Cold Phase