Analysis of Hydrologic Data
Estimation of Design Discharge and Water Level
Estimation of both flood discharge and high water level is necessary for bank protection design, and careful estimation is important for all sites with erodible banks. This section describes the methods of assessing flood discharge and water level at the site under consideration. The design discharge and water level are determined for a selected probability of exceedance or return period.
The design discharge and water level arising from floods should be selected after due consideration of the following:
1) the maximum historical discharge as recorded at the site, or as calculated on the basis of recorded water level at the site, or as calculated on the basis of measured discharge at other points on the river from which the corresponding site discharge can reasonably be inferred;
2) the discharge derived from a frequency analysis using a probability of exceedance or return period appropriate to the importance and value of the protection work;
3) the maximum historical water level as recorded at the site, or as inferred from observed or recorded water levels at other points on the river from which the level can reasonably be transferred to the site in question;
4) the water level derived from a frequency analysis using a probability of exceedance or return period appropriate to the importance and value of the protection work.
In estimating high flows, primary reliance should be placed on careful field investigations, local enquiries and searches of historical records. Data so obtained should be compared with recorded data for hydrometric stations and supplemented by analytical procedures using stage-discharge curves. At most hydrometric gauging stations a reasonably stable relationship exists between water level and discharge. At some sites, however, the stage-discharge curve may be quite unstable because of aggradation or degradation of the channel bed or backwater effects from downstream, and it may change drastically during major floods. A persistent rising or lowering trend of the curve indicates progressive channel aggradation or degradation. The stage corresponding to a design flood that exceeds any recorded flow is obtained by extrapolating the stage-discharge relationship.
The most commonly used method for estimating design discharge and water level examines the observed discharges and water levels to arrive at suitable estimates. The method, known as frequency analysis, is founded on statistical analysis of discharge and water level records. For locations where stream flow records are available, or where flows from another basin can be transferred to the design location, the design flood magnitude and water level can be estimated directly from those records by means of frequency analysis.
Frequency Analysis
The frequency of a hydrologic event, such as the annual peak flow, is the probability that a given value will be equaled or exceeded in any year. This is more appropriately called the exceedance probability, P(F). The reciprocal of the exceedance probability is the return period T in years, i.e., $T = 1/P(F)$. The length of record should be sufficient to justify extrapolating the frequency relationship. For example, it might be reasonable to estimate a 50-year flood on the basis of a 30-year record, but to estimate a 100-year flood on the basis of a 10-year record would normally be absurd (Neill 1973). Viessman and Lewis (1996) noted that, as a general rule, caution is needed when working with short records and when estimating hydrologic events with return periods greater than twice the record length.
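The relation between exceedance probability and return period, together with the record-length rule of thumb quoted above, can be sketched in a few lines of Python. The function names are illustrative only and are not taken from any reference.

```python
def return_period(exceedance_probability: float) -> float:
    """Return period T (years) for an annual exceedance probability P."""
    if not 0.0 < exceedance_probability <= 1.0:
        raise ValueError("exceedance probability must be in (0, 1]")
    return 1.0 / exceedance_probability


def max_supported_return_period(record_length_years: int) -> int:
    """Rule of thumb quoted in the text: about twice the record length."""
    return 2 * record_length_years


if __name__ == "__main__":
    # An annual exceedance probability of 2% corresponds to the 50-year event.
    print(return_period(0.02))
    # Longest return period the rule of thumb supports for a 32-year record.
    print(max_supported_return_period(32))
```

The second function is only a screening check; it does not replace judgment on the quality of the record.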
Frequency analysis can be conducted in two ways: one is the analytical approach and the other is the graphical technique in which flood magnitudes are usually plotted against probability of exceedance.
The following sections give procedures mainly for discharge frequency analysis; similar procedures can be followed for water level frequency analysis.
Analytical Frequency Analysis
Analytical frequency analysis is based on fitting theoretical probability distributions to given data. Numerous distributions have been suggested on the basis of their ability to fit the plotted data from streams (Linsley et al. 1988). The Log-Pearson Type III (LP3) Distribution has been adopted by United States federal agencies for flood analysis. The first asymptotic distribution of extreme values (EVI), commonly called the Gumbel Distribution, has been widely used and is recommended in the United Kingdom. The EVI Distribution was found to fit peak flow data for several rivers in Bangladesh (Bari and Saleque 1995).
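Extreme Value Distributions: Distributions of the extreme values selected from sets of samples of any probability distribution converge to one of three forms of Extreme Value Distribution, called Type I, II and III, respectively, when the number of selected extreme values is large. The three limiting forms are special cases of a single distribution called the Generalized Extreme Value (GEV) Distribution (Chow et al. 1988). The cumulative distribution function for the GEV is
$$F(x) = \exp\left[-\left(1 - k\,\frac{x-u}{\alpha}\right)^{1/k}\right] \qquad (1)$$
where $k$, $u$ and $\alpha$ are parameters to be determined. For the EVI Distribution ($k = 0$) x is unbounded, while for EVII ($k < 0$) x is bounded from below, and for EVIII ($k > 0$) x is bounded from above. The EVI and EVII Distributions are also known as the Gumbel and Frechet Distributions, respectively.
The Extreme Value Type I (EVI) cumulative distribution function is
$$F(x) = \exp\left[-\exp\left(-\frac{x-u}{\alpha}\right)\right], \qquad -\infty \le x \le \infty \qquad (2)$$
The parameters are estimated by
$$\alpha = \frac{\sqrt{6}\,s}{\pi} \quad \text{and} \quad u = \bar{x} - 0.5772\,\alpha \qquad (3)$$
Eq (2) can be expressed as
$$F(x) = \exp\left[-\exp(-y)\right] \qquad (4)$$
where y is the reduced variate defined as
$$y = \frac{x-u}{\alpha} \qquad (5)$$
Solving Eq (4) for y:
$$y = -\ln\left[\ln\left(\frac{1}{F(x)}\right)\right] \qquad (6)$$
Noting that the probability of occurrence of an event is the inverse of its return period T, we can write
$$\frac{1}{T} = P(x \ge x_T) = 1 - P(x < x_T) = 1 - F(x_T)$$
so
$$F(x_T) = \frac{T-1}{T}$$
and substituting for $F(x_T)$ into Eq (6)
$$y_T = -\ln\left[\ln\left(\frac{T}{T-1}\right)\right] \qquad (7)$$
For a given return period, $x_T$ is related to $y_T$ by Eq (5), or
$$x_T = u + \alpha\,y_T \qquad (8)$$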
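Eqs (3), (7) and (8) translate directly into code. The following is a minimal Python sketch, assuming the sample mean and standard deviation are already known; the function names are illustrative. Results computed this way may differ in the last digit or two from hand calculations that use rounded intermediate values.

```python
import math

EULER_GAMMA = 0.5772  # rounded constant, as used in Eq (3)


def ev1_parameters(mean: float, std: float) -> tuple[float, float]:
    """Method-of-moments estimates of alpha and u from Eq (3)."""
    alpha = math.sqrt(6.0) * std / math.pi
    u = mean - EULER_GAMMA * alpha
    return alpha, u


def reduced_variate(T: float) -> float:
    """Reduced variate y_T for return period T (years), Eq (7)."""
    return -math.log(math.log(T / (T - 1.0)))


def ev1_quantile(mean: float, std: float, T: float) -> float:
    """Flood magnitude x_T for return period T, Eq (8)."""
    alpha, u = ev1_parameters(mean, std)
    return u + alpha * reduced_variate(T)


if __name__ == "__main__":
    # Sample statistics of the Old Brahmaputra series used in Example 1 (m3/s).
    for T in (5, 10, 25, 50):
        print(T, round(reduced_variate(T), 2), round(ev1_quantile(2867.53, 804.54, T)))
```

Eq (7) requires T greater than 1 year, so the sketch is not meant for the lower tail of the distribution.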
Example 1
Using the EVI Distribution, a model is developed for frequency analysis of the annual peak flow data of the Old Brahmaputra River at Mymensingh, and the peak flows for return periods of 5, 10, 25 and 50 years are calculated.
Annual peak discharges (m³/s) of the Old Brahmaputra River at Mymensingh for the period 1964-98

Year   Peak flow     Year   Peak flow     Year   Peak flow
1964   2830          1978   2770          1989   2180
1965   3230          1979   2630          1990   2060
1966   3490          1980   3340          1991   2900
1967   3000          1981   2690          1992   1490
1968   2810          1982   2470          1993   2060
1969   2770          1983   2370          1994   1065
1970   3250          1984   4780          1995   3187
1974   3820          1985   3070          1996   2369
1975   3060          1986   1930          1997   1973
1976   3210          1987   3230          1998   3267
1977   3550          1988   4910

Sample size n = 32
Max = 4910, Min = 1065
Ave, $\bar{x}$ = 2867.53; Std, s = 804.54; Skew, Cs = 0.372
Note that data for 1971, 72 and 73 are missing. When a fairly long record has a short gap, it may be justifiable to estimate the missing data by correlation with a nearby station; otherwise it is preferable to consolidate the various recorded sequences as if they formed a continuous record (Neill 1973). The latter approach is used in this example.
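For the given data $\bar{x} = 2867.53$ and $s = 804.54$. Substituting in Eq (3) yields
$$\alpha = \frac{\sqrt{6}\,s}{\pi} = 627.62$$
and
$$u = 2867.53 - 0.5772\,(627.62) = 2505.27$$
The probability model is
$$F(x) = \exp\left[-\exp\left(-\frac{x - 2505.27}{627.62}\right)\right]$$
To determine the values of $x_T$ for various values of T, it is convenient to use the reduced variate $y_T$.
For T = 5 years, Eq (7) gives
$$y_5 = -\ln\left[\ln\left(\frac{5}{5-1}\right)\right] = 1.50$$
and Eq (8) yields $x_5$ = 2505.27 + 627.62 × 1.5 = 3446.7 m³/s.
Similarly, for other values of T, the $y_T$ and $x_T$ values are found as follows:
T = 10 years, $y_{10}$ = 2.25, $x_{10}$ = 3918 m³/s
T = 25 years, $y_{25}$ = 3.20, $x_{25}$ = 4513 m³/s
T = 50 years, $y_{50}$ = 3.90, $x_{50}$ = 4954 m³/s
Frequency Analysis using Frequency Factors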
Calculating the magnitudes of extreme events by the method outlined in the above example requires that the probability distribution function be invertible, that is, given a value of T or $F(x_T) = (T-1)/T$, the corresponding value of $x_T$ can be determined. Some probability distribution functions, such as the Normal and Pearson Type III Distributions, are not readily invertible. An alternative method based on the frequency factor is therefore used for calculating the magnitudes of extreme events. Chow (1951) has shown that most frequency functions can be generalized to
$$x_T = \bar{x} + K_T\,s \qquad (9)$$
where $x_T$ is a flood of specified probability or return period T, $\bar{x}$ is the mean of the flood series, s is the standard deviation of the series, and $K_T$ is the frequency factor, which is a function of the return period and the type of probability distribution, as well as of the coefficient of skewness for skewed distributions such as LP3.
In the event that the variable analyzed is $y = \log x$, as in the Lognormal and LP3 Distributions, the same method is applied to the statistics of the logarithms of the data using $y_T = \bar{y} + K_T\,s_y$, and the required value of $x_T$ is found by taking the antilog of $y_T$.
Chow (1951) proposed the frequency factor as in Eq (9), and it is applicable to many probability distributions used in hydrologic frequency analysis. The K-T relationship can be expressed in mathematical terms or by a table.
Normal Distribution: From Eq (9) the frequency factor can be expressed as
$$K_T = \frac{x_T - \mu}{\sigma} \qquad (10)$$
Thus, $K_T$ for the Normal Distribution is the same as the standard normal variable z. The value of z, and hence $K_T$, can be obtained from Table 2.
Lognormal Distribution: The recommended procedure for use of the Lognormal Distribution is to convert the data series to logarithms and compute:
1) $y = \log x$
2) Compute the mean $\bar{y}$ and standard deviation $s_y$
3) Compute $y_T = \bar{y} + K_T\,s_y$, where $K_T = z$ can be taken from Table 2.
4) Finally compute $x_T = 10^{y_T}$
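The four steps above can be sketched in Python. This is a minimal illustration under two assumptions: logarithms are taken to base 10, and the standard normal variate z is obtained from the standard library's `statistics.NormalDist` rather than read from Table 2. The function names are illustrative.

```python
import math
from statistics import NormalDist, mean, stdev


def standard_normal_variate(T: float) -> float:
    """z with cumulative probability 1 - 1/T (the role played by Table 2)."""
    return NormalDist().inv_cdf(1.0 - 1.0 / T)


def quantile_from_log_stats(y_bar: float, s_y: float, K_T: float) -> float:
    """Steps 3 and 4: y_T = y_bar + K_T * s_y, then antilog."""
    return 10.0 ** (y_bar + K_T * s_y)


def lognormal_quantile(flows, T: float) -> float:
    """Steps 1 to 4 applied to a series of annual peak flows."""
    y = [math.log10(q) for q in flows]   # step 1
    y_bar, s_y = mean(y), stdev(y)       # step 2 (sample statistics, n - 1)
    return quantile_from_log_stats(y_bar, s_y, standard_normal_variate(T))


if __name__ == "__main__":
    # Log statistics of the Old Brahmaputra series as quoted in Example 2.
    z50 = standard_normal_variate(50)
    print(round(z50, 3), round(quantile_from_log_stats(3.4394, 0.1326, z50)))
```

Computing z numerically avoids interpolating in the table, but the two routes should agree to about three decimal places.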
Log-Pearson Type III (LP3) Distribution: The recommended procedure for use of the LP3 Distribution is to convert the data series to logarithms and compute:
1) $y = \log x$
2) Compute the mean $\bar{y}$ and standard deviation $s_y$
3) Compute the coefficient of skewness $C_s$
4) Compute
$$y_T = \bar{y} + K_T\,s_y \qquad (11)$$
where $K_T$ is taken from Table 3.
5) Finally compute $x_T = 10^{y_T}$
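The five steps above can be sketched as follows. Two points are assumptions rather than part of the procedure as stated: the skewness is computed with the usual small-sample formula $C_s = n\sum(y-\bar{y})^3 / [(n-1)(n-2)s_y^3]$, and $K_T$ is obtained from a commonly quoted polynomial approximation in $k = C_s/6$ instead of being read from Table 3. The approximation is only a stand-in for the table and can differ from it in the second or third decimal place.

```python
import math
from statistics import NormalDist, mean, stdev


def sample_skew(values) -> float:
    """Coefficient of skewness Cs with the small-sample correction."""
    n = len(values)
    m, s = mean(values), stdev(values)
    return n * sum((v - m) ** 3 for v in values) / ((n - 1) * (n - 2) * s ** 3)


def lp3_frequency_factor(T: float, cs: float) -> float:
    """Approximate Pearson Type III K_T (assumed substitute for Table 3)."""
    z = NormalDist().inv_cdf(1.0 - 1.0 / T)  # standard normal variate
    k = cs / 6.0
    return (z + (z**2 - 1) * k + (z**3 - 6 * z) * k**2 / 3
            - (z**2 - 1) * k**3 + z * k**4 + k**5 / 3)


def lp3_quantile(flows, T: float) -> float:
    """Steps 1 to 5: logs, mean, standard deviation, skew, K_T, antilog."""
    y = [math.log10(q) for q in flows]
    y_T = mean(y) + lp3_frequency_factor(T, sample_skew(y)) * stdev(y)
    return 10.0 ** y_T


if __name__ == "__main__":
    # Skewness of the log series quoted in Example 2, T = 50 years.
    print(round(lp3_frequency_factor(50, -0.9303), 3))
```

For Cs = 0 the expression reduces to z, which matches the statement that the LP3 frequency factor then equals the standard normal variable.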
Table 3 gives values of the frequency factors for the LP3 Distribution for various values of return period and coefficient of skewness, Cs. When Cs = 0, the frequency factor is equal to the standard normal variable z (Table 2).
Extreme Value I (EVI) Distribution: Chow (1951) derived the following expression for the frequency factor for the EVI Distribution
$$K_T = -\frac{\sqrt{6}}{\pi}\left\{0.5772 + \ln\left[\ln\left(\frac{T}{T-1}\right)\right]\right\} \qquad (12)$$
When $x_T = \mu$, Eq (9) (in population terms) gives $K_T = 0$ and Eq (12) gives T = 2.33 years. This is the return period of the mean of the EVI Distribution.
Frequency factors for the EVI Distribution are given in Table 4, taken from Haan (1977). The values computed from Eq (12) correspond to the infinite sample size row of Table 4.
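Eq (12), combined with Eq (9), gives a compact way to compute EVI quantiles. The sketch below is a minimal Python illustration; the function names are illustrative, and the values it returns correspond to the infinite sample size row of Table 4 rather than to the finite-sample rows.

```python
import math


def ev1_frequency_factor(T: float) -> float:
    """Frequency factor K_T for the EVI Distribution, Eq (12)."""
    return -(math.sqrt(6.0) / math.pi) * (0.5772 + math.log(math.log(T / (T - 1.0))))


def quantile_from_factor(mean: float, std: float, K_T: float) -> float:
    """General frequency factor equation, Eq (9)."""
    return mean + K_T * std


if __name__ == "__main__":
    # Sample statistics of the Old Brahmaputra series (m3/s).
    for T in (2.33, 5, 25, 50, 100):
        K = ev1_frequency_factor(T)
        print(T, round(K, 3), round(quantile_from_factor(2867.53, 804.54, K)))
```

Running the loop shows K close to zero at T = 2.33 years, which is the return period of the mean noted above.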
Example 2
For illustration, the 5- and 50-year return period annual maximum discharges (m³/s) for the Old Brahmaputra River near Mymensingh are calculated using the Lognormal, Log-Pearson Type III and EVI Distributions.

Year   Peak flow Q   y = log Q     Year   Peak flow Q   y = log Q     Year   Peak flow Q   y = log Q
1964   2830          3.451786      1978   2770          3.442479      1989   2180          3.338456
1965   3230          3.509202      1979   2630          3.419955      1990   2060          3.313867
1966   3490          3.542825      1980   3340          3.523746      1991   2900          3.462397
1967   3000          3.477121      1981   2690          3.429752      1992   1490          3.173186
1968   2810          3.448706      1982   2470          3.392696      1993   2060          3.313867
1969   2770          3.442479      1983   2370          3.374748      1994   1065          3.027349
1970   3250          3.511883      1984   4780          3.679427      1995   3187          3.503382
1974   3820          3.582063      1985   3070          3.487138      1996   2369          3.374565
1975   3060          3.485721      1986   1930          3.285557      1997   1973          3.295127
1976   3210          3.506505      1987   3230          3.509202      1998   3267          3.514149
1977   3550          3.550228      1988   4910          3.691081

Ave, $\bar{y}$ = 3.4394; Std, $s_y$ = 0.1326; Skew, Cs = -0.9303
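The calculation is shown for T = 50 years; the 5-year values are obtained in the same way.
Lognormal Distribution: For T = 50 years, 1/T = 0.02; Table 2 is entered with the cumulative probability 1 - 0.02 = 0.98 and z = 2.054 is obtained by interpolation. Note that the value of the frequency factor can also be obtained from Table 3 with Cs = 0. With $K_{50}$ = 2.054, $y_{50}$ = 3.4394 + 2.054 × 0.1326 = 3.7118 and $x_{50} = 10^{3.7118}$ ≈ 5150 m³/s.
Log-Pearson Type III (LP3) Distribution: For Cs = -0.9303, Table 3 gives $K_{50}$ = 1.532, so $y_{50}$ = 3.4394 + 1.532 × 0.1326 = 3.6425 and $x_{50} = 10^{3.6425}$ ≈ 4390 m³/s.
Extreme Value I (EVI) Distribution: Eq (12) gives $K_{50}$ = 2.592 (however, Table 4 gives $K_{50}$ = 3.007 for n = 32 years), so $x_{50}$ = 2867.5 + 2.592 × 804.54 = 4953 m³/s.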
Graphical Frequency Analysis
The frequency of an event can be obtained by use of a probability plot, which is a plot of event magnitude versus probability. As a check that a probability distribution fits a set of hydrologic data, the data are plotted on specially designed probability paper that linearizes the distribution function. The plotted data are then fitted with a straight line for interpolation and extrapolation purposes. Determining the probability to assign to a data point is commonly referred to as determining the plotting position.
Plotting Positions: Plotting position refers to the probability value assigned to each piece of data to be plotted. If n is the total number of values to be plotted and m is the rank of a value in a list ordered by descending magnitude, the exceedance probability of the mth largest value, $x_m$, is, for large n,
$$P(X \ge x_m) = \frac{m}{n}$$
However, this simple formula (known as the California formula) produces a probability of 100% for m = n, which implies that the smallest sample value is the smallest possible value. A value of 100% cannot be plotted on many probability papers (Haan 1977). To overcome this limitation other formulas have been proposed. Several plotting position formulas are given below.

Plotting position formulas
California     m/n
Hazen          (m - 0.5)/n
Beard          1 - (0.5)^(1/n)
Weibull        m/(n + 1)
Gringorten     (m - 0.44)/(n + 0.12)
Chegodayev     (m - 0.3)/(n + 0.4)
Blom           (m - 3/8)/(n + 1/4)
Tukey          (3m - 1)/(3n + 1)

The technique in all cases is to arrange the data in increasing or decreasing order of magnitude and to assign the order number m to the ranked values. The most efficient formula for computing plotting positions for an unspecified distribution, and the one now commonly used for most sample data, is
$$P = \frac{m}{n+1}$$
When m is ranked from lowest to highest, P is an estimate of the probability of values being equal to or less than the ranked value, that is, $P(X \le x)$; when the rank is from highest to lowest, P is $P(X \ge x)$.
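Most of the formulas listed above share the form P = (m - b)/(n + 1 - 2b), with b = 0 for Weibull, 0.3 for Chegodayev, 1/3 for Tukey, 3/8 for Blom, 0.44 for Gringorten and 0.5 for Hazen. The Python sketch below uses that general form to rank a series and assign exceedance probabilities; the dictionary and function names are illustrative.

```python
# Values of b in P = (m - b) / (n + 1 - 2b) for the formulas listed above.
PLOTTING_B = {
    "weibull": 0.0,
    "chegodayev": 0.3,
    "tukey": 1.0 / 3.0,
    "blom": 3.0 / 8.0,
    "gringorten": 0.44,
    "hazen": 0.5,
}


def plotting_position(m: int, n: int, b: float = 0.0) -> float:
    """Plotting position of the value with rank m in a sample of size n."""
    return (m - b) / (n + 1.0 - 2.0 * b)


def exceedance_positions(values, b: float = 0.0):
    """Rank values from largest (m = 1) and return (m, value, P) tuples."""
    ranked = sorted(values, reverse=True)
    n = len(ranked)
    return [(m, x, plotting_position(m, n, b)) for m, x in enumerate(ranked, start=1)]


if __name__ == "__main__":
    # Short hypothetical series of annual peak flows (m3/s), for illustration only.
    for row in exceedance_positions([2830, 3230, 3490, 3000, 2810], PLOTTING_B["gringorten"]):
        print(row)
```

Because the ranking is from highest to lowest, the returned P values are exceedance probabilities; ranking in ascending order would give non-exceedance probabilities instead.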
Example 3
As an example, a probability plotting analysis of the annual maximum discharges (m³/s) of the Old Brahmaputra near Mymensingh is performed. The plotted data are also compared with the best-fit EVI Distribution.

Rank m   Peak flow Q   Plotting position P*     Rank m   Peak flow Q   Plotting position P     Rank m   Peak flow Q   Plotting position P
1        4910          0.017                    12       3187          0.360                   23       2470          0.702
2        4780          0.049                    13       3070          0.391                   24       2370          0.733
3        3820          0.080                    14       3060          0.422                   25       2369          0.765
4        3550          0.111                    15       3000          0.453                   26       2180          0.796
5        3490          0.142                    16       2900          0.484                   27       2060          0.827
6        3340          0.173                    17       2830          0.516                   28       2060          0.858
7        3267          0.204                    18       2810          0.547                   29       1973          0.889
8        3250          0.235                    19       2770          0.578                   30       1930          0.920
9        3230          0.267                    20       2770          0.609                   31       1490          0.951
10       3230          0.298                    21       2690          0.640                   32       1065          0.983
11       3210          0.329                    22       2630          0.671

Sample size n = 32
Ave = 2867.531
Std dev = 804.5443, Skew = 0.37198
* Gringorten, P = (m - 0.44)/(n + 1 - 0.88)
First the data are ranked from largest (m = 1) to smallest (m = n = 32), as shown above. Gringorten's plotting formula (b = 0.44) was used since the data are being fitted to the EVI Distribution. For example, for m = 1, the exceedance probability P(Q ≥ 4910 m³/s) = (m - 0.44)/(n + 1 - 0.88) = (1 - 0.44)/(32 + 0.12) = 0.56/32.12 = 0.017. All the plotting positions are calculated similarly and plotted on EVI paper (Fig 6). The plotted points represent the empirical distribution obtained using the 32 observed peak flows.
Several points on the best-fit EVI line are calculated using Eq (9) as follows:
T = 5 years, P(Q ≥ q) = 0.20, K5 = 0.719, Q5 = 3446 m³/s
T = 25 years, P(Q ≥ q) = 0.04, K25 = 2.044, Q25 = 4511 m³/s
T = 50 years, P(Q ≥ q) = 0.02, K50 = 2.592, Q50 = 4952 m³/s
T = 100 years, P(Q ≥ q) = 0.01, K100 = 3.137, Q100 = 5390 m³/s
A straight line is drawn through the calculated points to obtain the best-fit EVI Distribution line. In this example the plotted points show a good fit with the EVI Distribution.
Goodness-of-fit Tests
The goodness of fit of a probability distribution can be tested by comparing the theoretical and sample values of the relative frequency or the cumulative frequency function. In the case of the relative frequency function the $\chi^2$ test is used, and with the cumulative frequency function the Kolmogorov-Smirnov test is used.
Chi-Square Test: The test statistic is given by
$$\chi_c^2 = \sum_{i=1}^{k} \frac{n\,[f_s(x_i) - p(x_i)]^2}{p(x_i)} \qquad (13)$$
where k is the number of intervals; the sample value of the relative frequency of interval i is $f_s(x_i) = n_i/n$; the theoretical value of the relative frequency function (also called the incremental probability function) is $p(x_i) = F(x_i) - F(x_{i-1})$. It may be noted that $n f_s(x_i) = n_i$ is the observed number of occurrences in interval i, and $n p(x_i)$ is the corresponding expected number of occurrences in interval i.
To describe the $\chi^2$ test, the $\chi^2$ probability distribution must be defined. A $\chi^2$ distribution with $\nu$ degrees of freedom is the distribution of the sum of squares of $\nu$ independent standard normal random variables $z_i$. In the test, $\nu = k - l - 1$, where l is the number of parameters used in fitting the proposed distribution. The $\chi^2$ distribution function is tabulated in Table 5, from Haan (1977). A confidence level is chosen for the test; it is often expressed as $1 - \alpha$, where $\alpha$ is termed the significance level.
Fig. 6 EVI probability plot for annual peak flows of Old Brahmaputra, Mymensingh (axes: exceedance probability and return period in years against discharge in m³/s)
Example 4
The Chi-square test is used to determine whether the EVI Distribution adequately fits the Old Brahmaputra River annual peak flow data.
The thirty-two peak flow observations are divided into six class intervals. The number (frequency) of observations $n_i$ in each class is counted. The observed or sample value of the relative frequency $f_s(x_i)$ is calculated with n = 32. For example, for the second class interval $f_s(x_2)$ = 8/32 = 0.25. The observed cumulative frequency $F_s(x_i)$ is found by summing the relative frequencies.
To fit the EVI Distribution, the parameters $\alpha$ and u are calculated as before ($\alpha$ = 627.62, u = 2505.27, $\bar{x}$ = 2867.5 and s = 804.54 m³/s). The theoretical cumulative frequency corresponding to the upper limit of each class interval is calculated by finding the reduced variate y from Eq (5) and then F(x) from Eq (4). For example, for the second class interval
$$y_2 = \frac{2500 - 2505.27}{627.62} = -0.0084, \qquad F(2500) = \exp\left[-\exp(0.0084)\right] = 0.36479$$
and $p(x_2) = P(1750 \le X \le 2500) = F(2500) - F(1750) = 0.32904$.
The value of 0.32904 is entered under the expected relative frequency corresponding to the class interval 1750-2500 in the table below.
The calculation is repeated for the other class intervals and the terms are summed to obtain $\chi_c^2$ = 2.35. This is the computed $\chi^2$ value.
To test the goodness of fit, this is compared with the critical $\chi^2$ value obtained from tabulated values, as shown below.
Lower limit   Upper limit   Num of obs. ni   Obs frequency fs(xi)   Obs cum frequency Fs(xi)   Reduced variate yi   Expected cum frequency F(xi)   Expected relative frequency p(xi)   Chi square
1000          1750          2                0.0625                 0.0625                     -1.2033              0.03575                        0.03575                             0.64050
1750          2500          8                0.25                   0.3125                     -0.0084              0.36479                        0.32904                             0.60757
2500          3250          14               0.4375                 0.75                       1.1866               0.73693                        0.37214                             0.36734
3250          4000          6                0.1875                 0.9375                     2.3815               0.91174                        0.17481                             0.02948
4000          4750          1                0.03125                0.96875                    3.5765               0.97242                        0.06068                             0.45676
4750          5500          1                0.03125                1.0                        4.7716               0.99157                        0.01915                             0.24465
Total                       32               1.00                                                                                                  Computed Chi square                 2.3463
For a confidence level of 90% ($\alpha$ = 0.10), Table 5 gives the critical value $\chi^2$ = 6.25 for $\nu$ = k - l - 1 = 6 - 2 - 1 = 3 degrees of freedom. Since the computed Chi-square value of 2.35 is less than the critical value of 6.25, the EVI Distribution fits the data adequately.
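The calculation of Example 4 can be sketched in Python as a check on the hand computation. The sketch assumes, as the worked table does, that the cumulative probability below the first class is taken as zero, and it takes the critical value 6.25 from Table 5 as an input rather than computing it. Function names are illustrative.

```python
import math


def ev1_cdf(x: float, u: float, alpha: float) -> float:
    """EVI cumulative distribution function, Eq (2)."""
    return math.exp(-math.exp(-(x - u) / alpha))


def chi_square_statistic(upper_limits, counts, cdf) -> float:
    """Computed chi-square of Eq (13) from class upper limits and counts."""
    n = sum(counts)
    chi2 = 0.0
    f_prev = 0.0  # as in the worked table, F below the first class is taken as 0
    for upper, n_i in zip(upper_limits, counts):
        f_upper = cdf(upper)
        p_i = f_upper - f_prev              # expected relative frequency
        chi2 += n * (n_i / n - p_i) ** 2 / p_i
        f_prev = f_upper
    return chi2


if __name__ == "__main__":
    upper_limits = [1750, 2500, 3250, 4000, 4750, 5500]   # class limits, m3/s
    counts = [2, 8, 14, 6, 1, 1]                           # observations per class
    chi2 = chi_square_statistic(
        upper_limits, counts, lambda x: ev1_cdf(x, u=2505.27, alpha=627.62)
    )
    critical = 6.25  # Table 5, 3 degrees of freedom, significance level 0.10
    print(round(chi2, 2), "fit accepted" if chi2 < critical else "fit rejected")
```

Small differences from the tabulated terms arise because the table rounds the intermediate probabilities to five decimal places.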
Kolmogorov-Smirnov Test: The theoretical and sample values of the cumulative frequency are compared with the Kolmogorov-Smirnov (K-S) test. The test statistic D is based on the deviations of the sample distribution function P(x) from the completely specified continuous hypothetical distribution function Po(x), such that:
$$D = \max\,\left|P(x) - P_o(x)\right|$$
Developed by Kolmogorov in 1933 (Kite 1988), the test requires that the value of D computed from the sample distribution be less than the tabulated value of D (Table 6) at the required confidence level. The Kolmogorov-Smirnov test applied to Gumbel's extremal distribution is reported to give better results for Bangladesh.
Bankful Discharge
The bankful discharge of a river may be defined as the discharge that is just contained within the banks of the river. This is the state of maximum velocity in the channel, and therefore of maximum competence for the transport of sediment load.
Bankful discharge is assumed to be a major determinant of the size and shape of a river channel, but it is difficult to measure in the field, and a wide variety of field procedures exist for this measurement. Quoting a return period for bankful discharge is difficult because more than a dozen methods of estimating it are available, and the frequency of its occurrence appears to vary with climatic regime.
Dominant Discharge Analysis
The dominant discharge is the flow doing most geomorphic work and it is, therefore, the channel forming discharge. It probably does not correspond to bankful flow on any river. The dominant or channel forming flow represents an alternative benchmark criterion to bankful flow when analyzing channel form and process. To estimate the dominant discharge the following steps are followed:
1. Obtain the long-term (30 years or more) distribution of flows for the gauging station, expressed as frequency F against discharge Q.
2. Split this into discrete classes of equal interval. For the Brahmaputra, try initially a 5,000 m³/s class interval, and check the sensitivity of the results to this choice of interval. Find the mid-point of each class.
3. Obtain the most reliable sediment rating curve (sediment transport rate Qs against discharge Q) for the gauging station. Ideally this should be for total load, but a suspended load curve may be used provided that suspended load makes up most of the total load, as is usually the case.
4. Use the sediment rating curve to find the sediment transport rate Qsi (tons/sec) for the mid-point discharge Qi of each flow class.
5. Multiply the sediment transport rate for each discharge class by the frequency of that class to obtain the total sediment load transported by that flow during the period; plot this as a histogram of total Qs against Q.
6. From the histogram, identify the mode. This corresponds to the dominant discharge Qd. Determine the magnitude of Qd and use the flow duration curve (discharge against exceedance probability) to establish its return period.
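Steps 2 to 6 above can be sketched in Python. The sketch assumes a power-law sediment rating curve of the form Qs = a Q^b, which is a common form but is not specified in the text; the coefficients, class mid-points and frequencies used in the demonstration are hypothetical and serve only to show the mechanics.

```python
def dominant_discharge(class_midpoints, frequencies, rating_a, rating_b):
    """Return (Qd, loads): the modal class mid-point and the load per class.

    class_midpoints: mid-point discharge of each flow class (m3/s)
    frequencies:     time each class occurs during the period (e.g. days)
    rating_a, rating_b: coefficients of the ASSUMED rating curve Qs = a * Q**b
    """
    loads = []
    for q_mid, freq in zip(class_midpoints, frequencies):
        qs = rating_a * q_mid ** rating_b   # step 4: transport rate at mid-point
        loads.append(qs * freq)             # step 5: load carried by this class
    i_mode = max(range(len(loads)), key=loads.__getitem__)  # step 6: mode
    return class_midpoints[i_mode], loads


if __name__ == "__main__":
    # Hypothetical 5,000 m3/s classes and frequencies (days per year).
    midpoints = [2500, 7500, 12500, 17500, 22500]
    days = [150, 110, 70, 25, 10]
    q_d, loads = dominant_discharge(midpoints, days, rating_a=1e-6, rating_b=2.0)
    print(q_d)
```

Repeating the calculation with a different class interval is a simple way to carry out the sensitivity check suggested in step 2.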
River System and Estimation of Design Water Level and Discharge
Table 2 Cumulative probability of the Standard Normal Distribution
Table 3 Frequency factors for Pearson Type III Distribution
Table 4 Frequency factors for Extreme Value I Distribution

Sample size   Return period (years)
(n)           5       10      15      20      25      50      75      100     1000
15            0.967   1.703   2.117   2.410   2.632   3.321   3.721   4.005   6.265
20            0.919   1.625   2.023   2.302   2.517   3.179   3.563   3.836   6.006
25            0.888   1.575   1.963   2.235   2.444   3.088   3.463   3.729   5.842
30            0.866   1.541   1.922   2.188   2.393   3.026   3.393   3.653   5.727
35            0.851   1.516   1.891   2.152   2.354   2.979   3.341   3.598   -
40            0.838   1.495   1.866   2.126   2.326   2.943   3.301   3.554   5.576
45            0.829   1.478   1.847   2.104   2.303   2.913   3.268   3.520   -
50            0.820   1.466   1.831   2.086   2.283   2.889   3.241   3.491   5.478
55            0.813   1.455   1.818   2.071   2.267   2.869   3.219   3.467   -
60            0.807   1.446   1.806   2.059   2.253   2.852   3.200   3.446   -
65            0.801   1.437   1.796   2.048   2.241   2.837   3.183   3.429   -
70            0.797   1.430   1.788   2.038   2.230   2.824   3.169   3.413   5.359
75            0.792   1.423   1.780   2.029   2.220   2.812   3.155   3.400   -
80            0.788   1.417   1.773   2.020   2.212   2.802   3.145   3.387   -
85            0.785   1.413   1.767   2.013   2.205   2.793   3.135   3.376   -
90            0.782   1.409   1.762   2.007   2.198   2.785   3.125   3.367   -
95            0.780   1.405   1.757   2.002   2.193   2.777   3.116   3.357   -
100           0.779   1.401   1.752   1.998   2.187   2.770   3.109   3.349   5.261
∞             0.719   1.305   1.635   1.866   2.044   2.592   2.911   3.137   4.936
Table 5 $\chi^2$ Distribution: values of $\chi^2_{\alpha,\nu}$ for $\nu$ degrees of freedom (DOF) and right-tail probability $\alpha$

DOF    α = 0.005   0.010   0.025   0.05    0.10    0.25    0.50    0.75    0.90     0.95     0.975    0.99     0.995
1      7.88        6.63    5.02    3.84    2.71    1.32    0.455   0.102   0.0158   0.0039   0.0010   0.0002   0.0000
2      10.6        9.21    7.38    5.99    4.61    2.77    1.39    0.575   0.211    0.103    0.0506   0.0201   0.0100
3      12.8        11.3    9.35    7.81    6.25    4.11    2.37    1.21    0.584    0.352    0.216    0.115    0.072
4      14.9        13.3    11.1    9.49    7.78    5.39    3.36    1.92    1.06     0.711    0.484    0.297    0.207
5      16.7        15.1    12.8    11.1    9.24    6.63    4.35    2.67    1.61     1.15     0.831    0.554    0.412
6      18.5        16.8    14.4    12.6    10.6    7.84    5.35    3.45    2.20     1.64     1.24     0.872    0.676
7      20.3        18.5    16.0    14.1    12.0    9.04    6.35    4.25    2.83     2.17     1.69     1.24     0.989
8      22.0        20.1    17.5    15.5    13.4    10.2    7.34    5.07    3.49     2.73     2.18     1.65     1.34
9      23.6        21.7    19.0    16.9    14.7    11.4    8.34    5.90    4.17     3.33     2.70     2.09     1.73
10     25.2        23.2    20.5    18.3    16.0    12.5    9.34    6.74    4.87     3.94     3.25     2.56     2.16
11     26.8        24.7    21.9    19.7    17.3    13.7    10.3    7.58    5.58     4.57     3.82     3.05     2.60
12     28.3        26.2    23.3    21.0    18.5    14.8    11.3    8.44    6.30     5.23     4.40     3.57     3.07
13     29.8        27.7    24.7    22.4    19.8    16.0    12.3    9.30    7.04     5.89     5.01     4.11     3.57
14     31.3        29.1    26.1    23.7    21.1    17.1    13.3    10.2    7.79     6.57     5.63     4.66     4.07
15     32.8        30.6    27.5    25.0    22.3    18.2    14.3    11.0    8.55     7.26     6.26     5.23     4.60
16     34.3        32.0    28.8    26.3    23.5    19.4    15.3    11.9    9.31     7.96     6.91     5.81     5.14
17     35.7        33.4    30.2    27.6    24.8    20.5    16.3    12.8    10.1     8.67     7.56     6.41     5.70
18     37.2        34.8    31.5    28.9    26.0    21.6    17.3    13.7    10.9     9.39     8.23     7.01     6.26
19     38.6        36.2    32.9    30.1    27.2    22.7    18.3    14.6    11.7     10.1     8.91     7.63     6.84
20     40.0        37.6    34.2    31.4    28.4    23.8    19.3    15.5    12.4     10.9     9.59     8.26     7.43
21     41.4        38.9    35.5    32.7    29.6    24.9    20.3    16.3    13.2     11.6     10.3     8.90     8.03
22     42.8        40.3    36.8    33.9    30.8    26.0    21.3    17.2    14.0     12.3     11.0     9.54     8.64
23     44.2        41.6    38.1    35.2    32.0    27.1    22.3    18.1    14.8     13.1     11.7     10.2     9.26
24     45.6        43.0    39.4    36.4    33.2    28.2    23.3    19.0    15.7     13.8     12.4     10.9     9.89
25     46.9        44.3    40.6    37.7    34.4    29.3    24.3    19.9    16.5     14.6     13.1     11.5     10.5
26     48.3        45.6    41.9    38.9    35.6    30.4    25.3    20.8    17.3     15.4     13.8     12.2     11.2
27     49.6        47.0    43.2    40.1    36.7    31.5    26.3    21.7    18.1     16.2     14.6     12.9     11.8
28     51.0        48.3    44.5    41.3    37.9    32.6    27.3    22.7    18.9     16.9     15.3     13.6     12.5
29     52.3        49.6    45.7    42.6    39.1    33.7    28.3    23.6    19.8     17.7     16.0     14.3     13.1
30     53.7        50.9    47.0    43.8    40.3    34.8    29.3    24.5    20.6     18.5     16.8     15.0     13.8
40     66.8        63.7    59.3    55.8    51.8    45.6    39.3    33.7    29.1     26.5     24.4     22.2     20.7
50     79.5        76.2    71.4    67.5    63.2    56.3    49.3    42.9    37.7     34.8     32.4     29.7     28.0
60     92.0        88.4    83.3    79.1    74.4    67.0    59.3    52.3    46.5     43.2     40.5     37.5     35.5
70     104.2       100.4   95.0    90.5    85.5    77.6    69.3    61.7    55.3     51.7     48.8     45.4     43.3
80     116.3       112.3   106.6   101.9   96.6    88.1    79.3    71.1    64.3     60.4     57.2     53.5     51.2
90     128.3       124.1   118.1   113.1   107.6   98.6    89.3    80.6    73.3     69.1     65.6     61.8     59.2
100    140.2       135.8   129.6   124.3   118.5   109.1   99.3    90.1    82.4     77.9     74.2     70.1     67.3

Source: Catherine M. Thompson, Table of percentage points of the $\chi^2$ distribution, Biometrika, Vol. 32 (1941), by permission of the author and publisher.
Table 6 Kolmogorov-Smirnov Distribution

Sample size   Significance level
(n)           0.20      0.15      0.10      0.05      0.01
1             0.900     0.925     0.950     0.975     0.995
2             0.684     0.726     0.776     0.842     0.929
3             0.565     0.597     0.642     0.708     0.829
4             0.494     0.525     0.564     0.624     0.734
5             0.446     0.474     0.510     0.563     0.669
6             0.410     0.436     0.470     0.521     0.618
7             0.381     0.405     0.438     0.486     0.577
8             0.358     0.381     0.411     0.457     0.543
9             0.339     0.360     0.388     0.432     0.514
10            0.322     0.342     0.368     0.409     0.486
11            0.307     0.326     0.352     0.391     0.468
12            0.295     0.313     0.338     0.375     0.450
13            0.284     0.302     0.325     0.361     0.433
14            0.274     0.292     0.314     0.349     0.418
15            0.266     0.283     0.304     0.338     0.404
16            0.258     0.274     0.295     0.328     0.391
17            0.250     0.266     0.286     0.318     0.380
18            0.244     0.259     0.278     0.309     0.370
19            0.237     0.252     0.272     0.301     0.361
20            0.231     0.246     0.264     0.294     0.352
25            0.21      0.22      0.24      0.264     0.32
30            0.19      0.20      0.22      0.242     0.29
35            0.18      0.19      0.21      0.23      0.27
40            -         -         -         0.21      0.25
50            -         -         -         0.19      0.23
60            -         -         -         0.17      0.21
70            -         -         -         0.16      0.19
80            -         -         -         0.15      0.18
90            -         -         -         0.14      -
100           -         -         -         0.14      -
Asymptotic    1.07/√n   1.14/√n   1.22/√n   1.36/√n   1.63/√n
formula

Source: Z.W. Birnbaum, Journal of the American Statistical Association 47:425-441, 1952.