To cite this article: Akinsete, Alfred, Famoye, Felix and Lee, Carl (2008) 'The beta-Pareto distribution', Statistics, 42:6, 547–563. DOI: 10.1080/02331880801983876. URL: http://dx.doi.org/10.1080/02331880801983876. Online publication date: 01 December 2008. Publisher: Taylor & Francis. Downloaded by [Famoye, Felix] on 23 October 2008.

Statistics, Vol. 42, No. 6, December 2008, 547–563

The beta-Pareto distribution

Alfred Akinsete^a, Felix Famoye^b* and Carl Lee^b

^a Department of Mathematics, Marshall University, Huntington, WV, USA; ^b Department of Mathematics, Central Michigan University, Mount Pleasant, MI, USA

(Received 5 October 2006; revised version received 7 December 2007; accepted 21 January 2008)

In this paper, a four-parameter beta-Pareto distribution is defined and studied. Various properties of the distribution are discussed. The distribution is found to be unimodal and has either a unimodal or a decreasing hazard rate. The expressions for the mean, mean deviation, variance, skewness, kurtosis and entropies are obtained. The relationships between these moments and the parameters are provided. The method of maximum likelihood is proposed to estimate the parameters of the distribution. The distribution is applied to two flood data sets.

Keywords: unimodality; hazard function; moments; estimation; flood peaks

AMS Classification: 60E05; 62F10; 62P99

1. Introduction

The family of the Pareto distribution is well known in the literature for its capability in modelling the heavy-tailed distributions that are mostly common in data on income distribution [1], city population size [2,3], and size of firms [4, p. 151]. Newman [5] also provided many other quantities measured in the physical, biological, technological and social systems of various kinds, where the Pareto law has been found to be an appropriate fit. The application of extreme value theory to the study of environmental time series was studied by Smith [6], who presented a review of the statistical concepts in formulating air-quality standards.

The distribution of wealth in the society was investigated by Levy and Levy [7]. They showed, among other results, that general wealth accumulation process in the high wealth range with homogeneous investment talent leads to the Pareto distribution. The paper by Aban et al. [8] contains a detailed list of important areas where heavy-tailed distributions are found applicable. Some of these areas are in finance, physics, hydrology and engineering. There are also recent applications of the Pareto distribution in data sets on earthquakes, forest fire areas, fault lengths on Earth and Venus, and on oil and gas fields sizes [9].

Various forms of the Pareto distribution and its generalization exist in the literature. The name generalized Pareto distribution (GPD) was first used by Pickands [10] when making statistical inferences about the upper tail of a distribution function. The GPD is found useful in modelling extreme value data because of its long tail feature (see [11] and the references therein). The distribution is often called the ‘peaks over thresholds’ model since it is used to model exceedances over threshold level in flood control. The Pareto distribution is a special case of the GPD. The Pareto distribution is also obtained as a special case of another generalized Pareto distribution, which is generated by compounding a heavy-tailed skewed conditional gamma density function with parameters α and $\beta^{-1}$, where the weighting function for β has a gamma distribution with parameters k and θ [12, p. 193].

*Corresponding author. Emails: [email protected]; [email protected]

ISSN 0233-1888 print/ISSN 1029-4910 online. © 2008 Taylor & Francis. DOI: 10.1080/02331880801983876. http://www.informaworld.com

A random variable Y has a Pareto distribution [13, p. 574] if its probability density function is given by

$$f(y) = \frac{k\,\theta^{k}}{y^{k+1}}, \qquad k > 0,\ \theta > 0,\ y \ge \theta. \tag{1}$$

The Pareto distribution is skewed to the right and characterized by a shape parameter k and a scale parameter θ. The density f(y) is a decreasing function of y and achieves its maximum when y is smallest, i.e. when y = θ.

The Pareto distribution has been extended to the transformed Pareto distribution, otherwise called the Burr distribution [12, p. 193]. A few other distributions related to the Pareto by way of some transformations are the exponential distribution, the power function, the logistic and the chi-square distributions [5, 14, p. 151]. According to Newman [5], when the probability of measuring a particular value of some quantity varies inversely as a power of that value, the quantity is said to follow a Pareto distribution. Also, stable distributions having heavy tails are known to follow the Pareto distribution asymptotically [15].

Many authors, including Choulakian and Stephens [11], have claimed the versatility of the GPD in modelling many types of data with long tails. This motivates us to propose in this paper another generalization of the Pareto distribution, referred to as the beta-Pareto distribution (BPD).

In Section 2, we define the BPD and outline some special cases of the distribution. We investigate some properties of the distribution in Section 3. Some of these properties are the limit behaviour, unimodality and the hazard function of the BPD. Section 4 is devoted to the discussion on the moments of the distribution. The mean deviation from the mean and the mean deviation from the median are provided in Section 5. Renyi and Shannon entropies are discussed in Section 6. Section 7 consists of the maximum likelihood estimates of the parameters of the BPD. In Section 8, we provide applications of the BPD to flood data sets. We conclude in Section 9 with some remarks on the main results and their significance.

2. The beta-Pareto distribution

Let F(x) denote the cumulative distribution function (CDF) of a random variable X. The cumulative distribution function for a generalized class of distributions for the random variable X, as defined by Eugene et al. [16] (see also [17]), is generated by applying the inverse CDF to a beta distributed random variable to obtain

$$G(x) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)} \int_{0}^{F(x)} t^{\alpha-1}(1-t)^{\beta-1}\,dt, \qquad 0 < \alpha, \beta < \infty. \tag{2}$$

The corresponding probability density function for G(x) is given by

$$g(x) = \frac{1}{B(\alpha,\beta)}\,[F(x)]^{\alpha-1}\,[1-F(x)]^{\beta-1}\,F'(x), \tag{3}$$

where

$$B(\alpha,\beta) = \frac{\Gamma(\alpha)\,\Gamma(\beta)}{\Gamma(\alpha+\beta)}.$$

Various authors have studied the density in Equation (3) by assuming various cumulative distribution functions F(x) (see [16, 18–20]).

In the present study, we let F(x) be the CDF of the Pareto random variable with density function defined in Equation (1). By using Equation (1), the Pareto CDF is

$$F(x) = 1 - \left(\frac{x}{\theta}\right)^{-k}.$$

From Equation (3), the probability density function for the beta-Pareto random variable is given by

$$g(x) = \frac{k}{\theta\,B(\alpha,\beta)} \left\{1-\left(\frac{x}{\theta}\right)^{-k}\right\}^{\alpha-1} \left(\frac{x}{\theta}\right)^{-k\beta-1}, \qquad x \ge \theta,\ \alpha, \beta, \theta, k > 0. \tag{4}$$

By setting $(x/\theta)^{-k} = y$ in Equation (4), it is not difficult to show that $\int_{\theta}^{\infty} g(x)\,dx = 1$. If a random variable X has the density in Equation (4), we shall write X ∼ BP(α, β, θ, k).
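As an illustrative check of Equation (4), the following Python sketch evaluates the density and verifies numerically that it integrates to 1. The function names (`beta_fn`, `bpd_pdf`, `integrate_pdf`) and the midpoint rule are choices made here for illustration and are not part of the paper; the integral is taken after the substitution $u = (x/\theta)^{-k}$, which maps $[\theta, \infty)$ onto $(0, 1]$.

```python
import math


def beta_fn(a: float, b: float) -> float:
    """Complete beta function B(a, b), evaluated through log-gamma."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


def bpd_pdf(x: float, alpha: float, beta: float, theta: float, k: float) -> float:
    """Beta-Pareto density of Equation (4); zero below the support x >= theta."""
    if x < theta:
        return 0.0
    z = (x / theta) ** (-k)
    if z == 1.0 and alpha < 1.0:
        return math.inf  # the density is unbounded at x = theta when alpha < 1
    const = k / (theta * beta_fn(alpha, beta))
    return const * (1.0 - z) ** (alpha - 1.0) * (x / theta) ** (-k * beta - 1.0)


def integrate_pdf(alpha: float, beta: float, theta: float, k: float,
                  n: int = 20000) -> float:
    """Midpoint-rule estimate of the integral of g over [theta, infinity).

    Uses x = theta * u**(-1/k) with u in (0, 1), so that
    dx = (theta / k) * u**(-1/k - 1) du.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        x = theta * u ** (-1.0 / k)
        jacobian = (theta / k) * u ** (-1.0 / k - 1.0)
        total += bpd_pdf(x, alpha, beta, theta, k) * jacobian * h
    return total


if __name__ == "__main__":
    # beta = k = 2 and theta = 3 as in Figure 1, with alpha = 2 chosen for the demo.
    print(integrate_pdf(2.0, 2.0, 3.0, 2.0))
```

When α < 1 or β < 1 the transformed integrand is singular at an endpoint of (0, 1), so the plain midpoint rule used here would converge slowly; an adaptive quadrature routine would be a more suitable choice in those cases.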

The CDF in Equation (2) of the beta-Pareto random variable, denoted as G(x), can be re-expressed as follows. Let $G^{*}(x) = 1 - G(x)$; then $G^{*}(x)$ for the BPD with density function given in Equation (4) is

$$G^{*}(x) = \int_{x}^{\infty} \frac{k}{\theta\,B(\alpha,\beta)} \left\{1-\left(\frac{t}{\theta}\right)^{-k}\right\}^{\alpha-1} \left(\frac{t}{\theta}\right)^{-k\beta-1} dt, \qquad t \ge \theta.$$

By setting $y = (t/\theta)^{-k}$, the above integration becomes

$$G^{*}(x) = \int_{0}^{z} \frac{1}{B(\alpha,\beta)}\,y^{\beta-1}(1-y)^{\alpha-1}\,dy = \frac{B(z;\beta,\alpha)}{B(\alpha,\beta)}, \qquad 0 < z < 1,$$

where $B(z;\beta,\alpha)$ is an incomplete beta function with $z = (x/\theta)^{-k}$. Hence,

$$G(x) = 1 - \frac{B(z;\beta,\alpha)}{B(\alpha,\beta)} = 1 - \frac{z^{\beta}}{B(\alpha,\beta)} \left\{\frac{1}{\beta} + \frac{1-\alpha}{\beta+1}\,z + \cdots + \frac{(1-\alpha)(2-\alpha)\cdots(n-\alpha)}{n!\,(\beta+n)}\,z^{n} + \cdots\right\}, \tag{5}$$

by using the infinite series expansion for the incomplete beta function [21]. Equation (5) is another expression for the CDF of a BPD. For example, when α = 1 = β, Equation (5) reduces to $1 - (x/\theta)^{-k}$, which is the CDF of the Pareto distribution.
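The series in Equation (5) can be summed term by term. The sketch below is one possible implementation, assuming a simple absolute tolerance on the last term as the stopping rule; the names `log_beta` and `bpd_cdf`, the tolerance and the term limit are illustrative choices rather than anything prescribed in the paper.

```python
import math


def log_beta(a: float, b: float) -> float:
    """log B(a, b) via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def bpd_cdf(x: float, alpha: float, beta: float, theta: float, k: float,
            tol: float = 1e-14, max_terms: int = 100000) -> float:
    """CDF of the BPD from the series in Equation (5), with z = (x/theta)**(-k)."""
    if x <= theta:
        return 0.0
    z = (x / theta) ** (-k)
    if z == 0.0:
        return 1.0  # z underflowed, so x is far out in the tail
    coef = 1.0            # running product (1-alpha)(2-alpha)...(n-alpha)/n!
    power = 1.0           # running power z**n
    series = 1.0 / beta   # n = 0 term
    for n in range(1, max_terms):
        coef *= (n - alpha) / n
        power *= z
        term = coef * power / (beta + n)
        series += term
        if abs(term) < tol:
            break
    # Survival function 1 - G(x) = z**beta / B(alpha, beta) * series.
    survival = math.exp(beta * math.log(z) - log_beta(alpha, beta)) * series
    return 1.0 - survival


if __name__ == "__main__":
    # alpha = beta = 1 should reproduce the Pareto CDF 1 - (x/theta)**(-k).
    print(bpd_cdf(6.0, 1.0, 1.0, 3.0, 2.0), 1.0 - (6.0 / 3.0) ** (-2.0))
```

For integer α the series terminates after α terms. For non-integer α it is a power series in z, so convergence is fast for x well above θ and slows as x approaches θ (z approaches 1).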

2.1. Special cases

We consider some special cases of the BPD in what follows:

(a) Case 1: When α = β = 1, the BPD in Equation (4) reduces to the Pareto distribution in Equation (1) with parameters k and θ.

(b) Case 2: When α = 1, the BPD with parameters α, β, θ and k reduces to the Pareto distribution with parameters kβ = c and θ, with density function

$$f(x) = \frac{c\,\theta^{c}}{x^{c+1}}, \qquad c > 0,\ \theta > 0,\ x \ge \theta.$$


(c) Case 3:

THEOREM 1 If X ∼ BP(α, β, θ, k), then the random variable $Y = (X/\theta)^{-k}$ has the arcsine distribution when α = 1/2 = β.

Proof Using the transformation method, it is easy to show that the random variable Y has the arcsine density function given by $f(y) = (1/\pi)\bigl(1/\sqrt{y(1-y)}\bigr)$, 0 < y < 1. □

(d) Case 4:

THEOREM 2 If X ∼ BP(α, β, θ, k), then the random variable Y = β ln(X/θ) has the logbeta distribution, with parameters α, k and k/β. That is, Y ∼ logbeta(α, k, k/β).

Proof Using the transformation method, we can show that the random variable Y has the density function given by

$$f(y) = \frac{k}{\beta\,B(\alpha,\beta)} \left(1 - \exp\left(-\frac{ky}{\beta}\right)\right)^{\alpha-1} \exp(-yk), \qquad 0 < y < \infty. \tag{6}$$

According to Dufresne [22], if Y ∼ logbeta(a, b, c), then

$$f(y) = \kappa(a,b,c)\,e^{-by}\,(1-e^{-cy})^{a-1}, \quad y > 0, \qquad \text{where } \kappa(a,b,c) = \frac{c\,\Gamma(a+b/c)}{\Gamma(a)\,\Gamma(b/c)}.$$

□

(e) Case 5:

COROLLARY 1 If X ∼ BP(α = 1, β, θ, k), then the random variable Y = β ln(X/θ) follows the exponential distribution with mean 1/k.

Proof By setting α = 1 in Equation (6), the result follows. □

(f) Case 6:

COROLLARY 2 If X ∼ BP(α, β, θ, k), then the probability density function of the random variable Y = β ln(X/θ) is an integral representation of a beta function when k = β, or a special case of the beta-Weibull distribution, BW(α, β, c, γ).

Proof When k = β in Theorem 2, the probability density function of the random variable Y may be written as

$$f(y) = \frac{1}{B(\alpha,\beta)}\,e^{-y\beta}\,(1-e^{-y})^{\alpha-1}, \qquad 0 < y < \infty.$$

This is a special case of the beta-Weibull distribution, BW(α, β, c, γ), discussed in Famoye et al. [18], when c = 1 = γ. The density is also an integral representation of the beta function, since according to Zwillinger and Kokoska [14, p. 551],

$$\int_{0}^{\infty} \frac{1}{B(p,q)}\,e^{-tp}\,(1-e^{-t})^{q-1}\,dt = 1, \qquad \mathrm{Re}(p) > 0,\ \mathrm{Re}(q) > 0.$$

□


2.2. Other types of beta-Pareto distributions

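Johnson et al. [13] discussed various types of Pareto distributions other than the Pareto density in Equation (1). The density in Equation (1) is called the Pareto Type I. The cumulative distribution functions of Pareto Types II, III and IV are, respectively, given by Johnson et al. [13] as

$$F_2(x) = 1 - \left(1 + \frac{x}{C}\right)^{-a}, \qquad x > 0,\ C, a > 0,$$

$$F_3(x) = 1 - \frac{C\,e^{-bx}}{(x+C)^{a}}, \qquad x > 0,\ C, a, b > 0,$$

and

$$F_4(x) = 1 - \left[1 + \left(\frac{x-\mu}{\sigma}\right)^{1/\lambda}\right]^{-\xi}, \qquad x > \mu,\ \xi, \lambda, \sigma > 0.$$

By applying the distribution functions $F_2(x)$, $F_3(x)$ and $F_4(x)$ in Equation (3), the corresponding types of beta-Pareto density functions can be defined, respectively, as

$$g_2(x) = \frac{a/C}{B(\alpha,\beta)} \left\{1 - \left(1 + \frac{x}{C}\right)^{-a}\right\}^{\alpha-1} \left(1 + \frac{x}{C}\right)^{-a\beta-1},$$

$$g_3(x) = \frac{b + a/(x+C)}{B(\alpha,\beta)} \left\{1 - \frac{C\,e^{-bx}}{(x+C)^{a}}\right\}^{\alpha-1} \left(\frac{C\,e^{-bx}}{(x+C)^{a}}\right)^{\beta},$$

and

$$g_4(x) = \frac{\xi}{\sigma\lambda\,B(\alpha,\beta)} \left(\frac{x-\mu}{\sigma}\right)^{1/\lambda-1} \left\{1 - \left[1 + \left(\frac{x-\mu}{\sigma}\right)^{1/\lambda}\right]^{-\xi}\right\}^{\alpha-1} \left\{1 + \left(\frac{x-\mu}{\sigma}\right)^{1/\lambda}\right\}^{-\xi\beta-1}.$$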

3. Some properties of the BPD

We investigate some properties of the BPD in this section.

3.1. Limit behaviour

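LEMMA 1 The limit of the beta-Pareto density as x → ∞ is 0 and the limit as x → θ is given by

$$\lim_{x\to\theta} g(x) = \begin{cases} \infty, & \text{when } 0 < \alpha < 1,\\ k\beta/\theta, & \text{when } \alpha = 1,\\ 0, & \text{when } \alpha > 1. \end{cases}$$

Proof It is straightforward to show the above from the beta-Pareto density in Equation (4). In particular, for α = 1 the density reduces to $(k\beta/\theta)(x/\theta)^{-k\beta-1}$, which equals kβ/θ at x = θ. □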

3.2. Transformation

THEOREM 3 If Y is a beta random variable with parameters α and β, then the random variable

$$X = \theta\,(1-Y)^{-1/k}$$

has a BPD with parameters α, β, θ and k.

Proof Using the transformation method, we can show that the random variable X has a probability density function that is given in Equation (4). □
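Theorem 3 gives a direct way to simulate from the BPD: draw a beta variate and transform it. The following sketch uses `random.betavariate` from the Python standard library; the function name `sample_bpd` and the seeding convention are assumptions made for this illustration.

```python
import random
from typing import List, Optional


def sample_bpd(alpha: float, beta: float, theta: float, k: float,
               size: int, seed: Optional[int] = None) -> List[float]:
    """Draw `size` values X = theta * (1 - Y)**(-1/k) with Y ~ Beta(alpha, beta)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(size):
        y = rng.betavariate(alpha, beta)
        # y == 1.0 would map to x = infinity; it is not expected in practice.
        draws.append(theta * (1.0 - y) ** (-1.0 / k))
    return draws


if __name__ == "__main__":
    xs = sample_bpd(2.0, 5.0, 3.0, 2.0, size=20000, seed=1)
    print(min(xs), sum(xs) / len(xs))
```

For α = 2, β = 5, θ = 3 and k = 2, the sample mean can be compared with the exact mean θB(α, β − 1/k)/B(α, β) = 3 · 30/24.75 ≈ 3.636; every draw is at least θ by construction.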

3.3. Unimodality

THEOREM 4 The BPD is unimodal at $x_0$. When 0 < α ≤ 1, the point $x_0 = \theta$ and when α ≥ 1 the point $x_0$ is given by

$$x_0 = \theta \left(\frac{k\beta + 1 + k(\alpha-1)}{k\beta + 1}\right)^{1/k}.$$

Proof The derivative of Equation (4) is given by

$$g'(x) = \frac{k}{\theta^{2} B(\alpha,\beta)} \left[1 - \left(\frac{x}{\theta}\right)^{-k}\right]^{\alpha-2} \left(\frac{x}{\theta}\right)^{-k\beta-2} \times \left\{k(\alpha-1)\left(\frac{x}{\theta}\right)^{-k} - (k\beta+1)\left[1 - \left(\frac{x}{\theta}\right)^{-k}\right]\right\}. \tag{7}$$

By equating Equation (7) to zero and solving for x, the first bracket involving x yields a critical point $x_0 = \theta$. The second bracket gives $x_0 = 0$. This value cannot be a modal point, since x ≥ θ > 0. Equating the third bracket involving x to zero, we have

$$k(\alpha-1)\left(\frac{x}{\theta}\right)^{-k} - (k\beta+1)\left[1 - \left(\frac{x}{\theta}\right)^{-k}\right] = 0.$$

The value of x in this equation satisfies

$$x_0 = \theta \left(\frac{k\beta + 1 + k(\alpha-1)}{k\beta + 1}\right)^{1/k}.$$

Since x ≥ θ, it is required that $[(k\beta + 1 + k(\alpha-1))/(k\beta+1)]^{1/k} \ge 1$, implying that α ≥ 1. If α = 1, then $x_0 = \theta$, which agrees with the critical point from the first bracket in Equation (7). If 0 < α < 1, the BPD is a decreasing function of x and its maximum is at the point $x_0 = \theta$. This concludes the proof. □

The graphs of the beta-Pareto density in Figure 1 show that the function is unimodal for the various values of α. Note that the BPD reduces to the Pareto distribution when α = 1 = β and the Pareto distribution has its mode at the point x = θ.
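The modal point of Theorem 4 is easy to evaluate and to sanity-check numerically. In the sketch below, `bpd_mode` implements the closed form and `log_kernel` is a helper introduced here (not in the paper) that returns log g(x) up to an additive constant, so that values on either side of the mode can be compared.

```python
import math


def bpd_mode(alpha: float, beta: float, theta: float, k: float) -> float:
    """Modal point x0 of the BPD according to Theorem 4."""
    if alpha <= 1.0:
        return theta
    ratio = (k * beta + 1.0 + k * (alpha - 1.0)) / (k * beta + 1.0)
    return theta * ratio ** (1.0 / k)


def log_kernel(x: float, alpha: float, beta: float, theta: float, k: float) -> float:
    """log g(x) without its normalising constant; valid for x > theta."""
    z = (x / theta) ** (-k)
    return (alpha - 1.0) * math.log(1.0 - z) - (k * beta + 1.0) * math.log(x / theta)


if __name__ == "__main__":
    # beta = k = 2 and theta = 3 as in Figure 1, with alpha = 3 chosen for the demo.
    x0 = bpd_mode(3.0, 2.0, 3.0, 2.0)
    here = log_kernel(x0, 3.0, 2.0, 3.0, 2.0)
    print(x0, here > log_kernel(0.99 * x0, 3.0, 2.0, 3.0, 2.0),
          here > log_kernel(1.01 * x0, 3.0, 2.0, 3.0, 2.0))
```

With these values the closed form gives $x_0 = 3\sqrt{1.8}$, and the log-density is lower a short distance to either side, as expected at a maximum.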

3.4. Hazard function

By definition, the hazard function (or failure rate) of a random variable X with density g(x) and a cumulative distribution function G(x) is given by

$$h(x) = \frac{g(x)}{1 - G(x)},$$

where g(x) and G(x) are given by Equations (4) and (5), respectively. Using Equations (4) and (5), the hazard function of the BPD may be expressed as

$$h(x) = \frac{(k/x)\left(1 - (x/\theta)^{-k}\right)^{\alpha-1}}{\dfrac{1}{\beta} + \dfrac{1-\alpha}{\beta+1}\left(\dfrac{x}{\theta}\right)^{-k} + \cdots + \dfrac{(1-\alpha)(2-\alpha)\cdots(n-\alpha)}{n!\,(\beta+n)}\left(\dfrac{x}{\theta}\right)^{-nk} + \cdots}.$$

It is noted that by setting α = 1 = β in h(x), we have the hazard function of the Pareto distribution.


Figure 1. Density functions for β = k = 2, θ = 3 and values of α.

LEMMA 2 The limit of the beta-Pareto hazard function as x → ∞ is 0 and the limit as x → θ is given by

$$\lim_{x\to\theta} h(x) = \begin{cases} \infty, & \text{when } 0 < \alpha < 1,\\ k\beta/\theta, & \text{when } \alpha = 1,\\ 0, & \text{when } \alpha > 1. \end{cases}$$

Proof It is straightforward to show the results of Lemma 2 by taking the limit of the beta-Pareto hazard function. For α = 1 the BPD is the Pareto distribution with shape parameter kβ, whose hazard function is kβ/x. □

LEMMA 3 The BPD has a unimodal failure rate when α > 1 and a decreasing failure rate when 0 < α ≤ 1.

The proof of Lemma 3 follows from Lemma 2. The graphs of the beta-Pareto hazard function (or failure rate) are provided in Figure 2.
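The hazard function can be evaluated from its series form, h(x) = (k/x)(1 − z)^{α−1} divided by the series in Equation (5), with $z = (x/\theta)^{-k}$. The sketch below is a minimal illustration under the assumption x > θ (so that the α < 1 singularity at x = θ is avoided); the name `bpd_hazard` and the stopping tolerance are hypothetical choices.

```python
def bpd_hazard(x: float, alpha: float, beta: float, theta: float, k: float,
               tol: float = 1e-14, max_terms: int = 100000) -> float:
    """Hazard rate of the BPD for x > theta, using the series of Equation (5)."""
    if x <= theta:
        raise ValueError("this sketch requires x > theta")
    z = (x / theta) ** (-k)
    coef = 1.0            # running product (1-alpha)(2-alpha)...(n-alpha)/n!
    power = 1.0           # running power z**n
    series = 1.0 / beta   # n = 0 term
    for n in range(1, max_terms):
        coef *= (n - alpha) / n
        power *= z
        term = coef * power / (beta + n)
        series += term
        if abs(term) < tol:
            break
    return (k / x) * (1.0 - z) ** (alpha - 1.0) / series


if __name__ == "__main__":
    # alpha = 1 gives the Pareto hazard k*beta/x.
    print(bpd_hazard(6.0, 1.0, 2.0, 3.0, 2.0), 2.0 * 2.0 / 6.0)
    # alpha = 3 (with beta = k = 2, theta = 3): the hazard rises, then falls.
    print([bpd_hazard(x, 3.0, 2.0, 3.0, 2.0) for x in (3.01, 5.0, 50.0)])
```

The second print illustrates the unimodal shape for α > 1: the value at x = 5 exceeds the values at both x = 3.01 and x = 50.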

4. Moments of the BPD

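It is easier to find the moment of the quantity $(X/\theta)^{r}$ first, and with appropriate manipulation, obtain the moment of $X^{r}$. By definition,

$$E\left(\frac{X}{\theta}\right)^{r} = \frac{k}{\theta\,B(\alpha,\beta)} \int_{\theta}^{\infty} \left\{1 - \left(\frac{x}{\theta}\right)^{-k}\right\}^{\alpha-1} \left(\frac{x}{\theta}\right)^{-k\beta+r-1} dx.$$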


Figure 2. Hazard functions for β = k = 2, θ = 3 and values of α.

The beta integral enables us to write the rth moment as

$$E(X^{r}) = \theta^{r}\,\frac{B(\alpha, \beta - r/k)}{B(\alpha,\beta)}. \tag{8}$$

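When r = 1 in Equation (8), the mean of the BPD may be expressed as

$$\mu = E(X) = \theta \left\{\frac{B(\alpha, \beta - 1/k)}{B(\alpha,\beta)}\right\} = \frac{\theta\,\Gamma(\alpha+\beta)\,\Gamma(\beta - 1/k)}{\Gamma(\beta)\,\Gamma(\alpha+\beta-1/k)}. \tag{9}$$

It is interesting to note that when α = 1 = β, Equation (9) reduces to

$$E(X) = \frac{\theta k}{k-1},$$

which is the mean of the Pareto distribution. By using appropriate moment expressions, the variance ($\sigma^{2}$), skewness ($\alpha_3$) and kurtosis ($\alpha_4$) of the BPD may be written, respectively, as

$$\sigma^{2} = \theta^{2} \left\{\frac{B(\alpha, \beta-2/k)}{B(\alpha,\beta)} - \left(\frac{B(\alpha, \beta-1/k)}{B(\alpha,\beta)}\right)^{2}\right\} = \frac{\theta^{2}\,\Gamma(\alpha+\beta)\,\Gamma(\beta-2/k)}{\Gamma(\alpha+\beta-2/k)\,\Gamma(\beta)} - \mu^{2}, \tag{10}$$

$$\alpha_3 = \frac{B^{2}(\alpha,\beta)\,B(\alpha,\beta-3/k) - 3B(\alpha,\beta)\,B(\alpha,\beta-1/k)\,B(\alpha,\beta-2/k) + 2B^{3}(\alpha,\beta-1/k)}{\left\{B(\alpha,\beta)\,B(\alpha,\beta-2/k) - B^{2}(\alpha,\beta-1/k)\right\}^{3/2}}, \tag{11}$$

$$\alpha_4 = \frac{T(B)}{\left\{B(\alpha,\beta)\,B(\alpha,\beta-2/k) - B^{2}(\alpha,\beta-1/k)\right\}^{2}}, \tag{12}$$

where

$$T(B) = B^{3}(\alpha,\beta)\,B\!\left(\alpha,\beta-\tfrac{4}{k}\right) - 4B^{2}(\alpha,\beta)\,B\!\left(\alpha,\beta-\tfrac{3}{k}\right)B\!\left(\alpha,\beta-\tfrac{1}{k}\right) + 6B(\alpha,\beta)\,B\!\left(\alpha,\beta-\tfrac{2}{k}\right)B^{2}\!\left(\alpha,\beta-\tfrac{1}{k}\right) - 3B^{4}\!\left(\alpha,\beta-\tfrac{1}{k}\right).$$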

From Equation (9), the mean of the BPD is defined when β > 1/k and from Equation (10), the variance of the BPD is defined when β > 2/k. From Equations (9) and (10), we observe that both the mean and the variance are increasing functions of θ. To examine how the mean and the variance change for other parameter values, we provide Table 1, which contains the mean and variance of the BPD for various parameter values. From Table 1, we notice that both the mean and the variance of the BPD are increasing functions of α and θ, and are both decreasing functions of β and k.
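Equations (8)–(10) translate directly into code once the beta-function ratio is evaluated through log-gamma. The sketch below is an illustration only; the function names are chosen here, and a `ValueError` is raised when the existence condition β > r/k fails.

```python
import math


def log_beta(a: float, b: float) -> float:
    """log B(a, b) via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def bpd_raw_moment(r: float, alpha: float, beta: float, theta: float, k: float) -> float:
    """E(X**r) from Equation (8); the moment exists only when beta > r/k."""
    if beta <= r / k:
        raise ValueError("the r-th moment requires beta > r/k")
    return theta ** r * math.exp(log_beta(alpha, beta - r / k) - log_beta(alpha, beta))


def bpd_mean_variance(alpha: float, beta: float, theta: float, k: float):
    """Mean (Equation (9)) and variance (Equation (10)) of the BPD."""
    m1 = bpd_raw_moment(1, alpha, beta, theta, k)
    m2 = bpd_raw_moment(2, alpha, beta, theta, k)
    return m1, m2 - m1 * m1


if __name__ == "__main__":
    # First entry of Table 1: alpha = 0.75, k = 5, theta = 1, beta = 0.75.
    print(bpd_mean_variance(0.75, 0.75, 1.0, 5.0))
    # Pareto check (alpha = beta = 1): the mean should be theta*k/(k-1) = 3.
    print(bpd_mean_variance(1.0, 1.0, 2.0, 3.0)[0])
```

The first call should agree with the Table 1 values μ = 1.3023 and σ² = 0.2395 to the printed precision.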

From Equation (11), the skewness of the BPD is defined when β > 3/k and from Equation (12), the kurtosis of the BPD is defined when β > 4/k. Both the skewness and the kurtosis are free of parameter θ. Table 2 contains the skewness and kurtosis of the BPD for various values of parameters α, β and k. We notice that both the skewness and the kurtosis are decreasing functions of α, β and k.
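Because the skewness and kurtosis do not depend on θ, they can be computed from the raw moments with θ = 1. The following sketch does this with the standard raw-to-central moment identities, which are algebraically equivalent to Equations (11) and (12); the function names are illustrative.

```python
import math


def log_beta(a: float, b: float) -> float:
    """log B(a, b) via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def unit_raw_moment(r: int, alpha: float, beta: float, k: float) -> float:
    """E(X**r) for theta = 1, i.e. B(alpha, beta - r/k) / B(alpha, beta)."""
    return math.exp(log_beta(alpha, beta - r / k) - log_beta(alpha, beta))


def bpd_skewness_kurtosis(alpha: float, beta: float, k: float):
    """Skewness and kurtosis of the BPD; both require beta > 4/k here."""
    if beta <= 4.0 / k:
        raise ValueError("the kurtosis requires beta > 4/k")
    m1, m2, m3, m4 = (unit_raw_moment(r, alpha, beta, k) for r in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    mu3 = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3
    mu4 = m4 - 4.0 * m3 * m1 + 6.0 * m2 * m1 ** 2 - 3.0 * m1 ** 4
    return mu3 / var ** 1.5, mu4 / var ** 2


if __name__ == "__main__":
    # Table 2 entry for alpha = 2, beta = 2, k = 5.
    print(bpd_skewness_kurtosis(2.0, 2.0, 5.0))
```

For α = 2, β = 2 and k = 5 the result should match the Table 2 entries 2.262 and 13.033 to the printed precision. For α = β = 1 and k = 5 it reproduces the Pareto values 6√0.6 ≈ 4.648 and 73.8.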

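Table 1. Mean and variance of beta-Pareto for various values of α, β, k and θ.

| α | k | θ | μ (β = 0.75) | σ² (β = 0.75) | μ (β = 2.0) | σ² (β = 2.0) | μ (β = 5.0) | σ² (β = 5.0) |
|---|---|---|---|---|---|---|---|---|
| 0.75 | 5.0 | 1.0 | 1.3023 | 0.2395 | 1.0873 | 0.0124 | 1.0319 | 0.0015 |
| 0.75 | 5.0 | 2.0 | 2.6046 | 0.9581 | 2.1746 | 0.0494 | 2.0637 | 0.0058 |
| 0.75 | 5.0 | 5.0 | 6.5116 | 5.9882 | 5.4364 | 0.3088 | 5.1593 | 0.0364 |
| 0.75 | 7.5 | 1.0 | 1.1806 | 0.0628 | 1.0563 | 0.0048 | 1.0210 | 0.0006 |
| 0.75 | 7.5 | 2.0 | 2.3611 | 0.2512 | 2.1125 | 0.0191 | 2.0420 | 0.0025 |
| 0.75 | 7.5 | 5.0 | 5.9028 | 1.5700 | 5.2814 | 0.1193 | 5.1049 | 0.0154 |
| 0.75 | 10.0 | 1.0 | 1.1287 | 0.0283 | 1.0415 | 0.0025 | 1.0156 | 0.0003 |
| 0.75 | 10.0 | 2.0 | 2.2575 | 0.1130 | 2.0830 | 0.0100 | 2.0313 | 0.0014 |
| 0.75 | 10.0 | 5.0 | 5.6437 | 0.7065 | 5.2076 | 0.0627 | 5.0782 | 0.0084 |
| 2.0 | 5.0 | 1.0 | 1.5396 | 0.4074 | 1.1905 | 0.0251 | 1.0776 | 0.0034 |
| 2.0 | 5.0 | 2.0 | 3.0792 | 1.6298 | 2.3810 | 0.1003 | 2.1552 | 0.0136 |
| 2.0 | 5.0 | 5.0 | 7.6879 | 10.1861 | 5.9524 | 0.6269 | 5.3879 | 0.0851 |
| 2.0 | 7.5 | 1.0 | 1.3165 | 0.0975 | 1.1213 | 0.0092 | 1.0507 | 0.0014 |
| 2.0 | 7.5 | 2.0 | 2.6330 | 0.3898 | 2.2425 | 0.0367 | 2.1015 | 0.0056 |
| 2.0 | 7.5 | 5.0 | 6.5826 | 2.4363 | 5.6063 | 0.2297 | 5.2537 | 0.0350 |
| 2.0 | 10.0 | 1.0 | 1.2238 | 0.0420 | 1.0889 | 0.0047 | 1.0377 | 0.0008 |
| 2.0 | 10.0 | 2.0 | 2.4476 | 0.1676 | 2.1779 | 0.0188 | 2.0754 | 0.0030 |
| 2.0 | 10.0 | 5.0 | 6.1189 | 1.0490 | 5.4446 | 0.1177 | 5.1885 | 0.0190 |
| 5.0 | 5.0 | 1.0 | 1.8310 | 0.6208 | 1.3504 | 0.0429 | 1.1636 | 0.0067 |
| 5.0 | 5.0 | 2.0 | 3.6620 | 2.4834 | 2.7007 | 0.1715 | 2.3272 | 0.0270 |
| 5.0 | 5.0 | 5.0 | 9.1549 | 15.5211 | 6.7518 | 1.0718 | 5.8179 | 0.1685 |
| 5.0 | 7.5 | 1.0 | 1.4760 | 0.1336 | 1.2188 | 0.0145 | 1.1057 | 0.0026 |
| 5.0 | 7.5 | 2.0 | 2.9521 | 0.5342 | 2.4376 | 0.0581 | 2.2114 | 0.0106 |
| 5.0 | 7.5 | 5.0 | 7.3802 | 3.3390 | 6.0939 | 0.3634 | 5.5284 | 0.0660 |
| 5.0 | 10.0 | 1.0 | 1.3328 | 0.0546 | 1.1590 | 0.0072 | 1.0780 | 0.0014 |
| 5.0 | 10.0 | 2.0 | 2.6656 | 0.2184 | 2.3179 | 0.0287 | 2.1561 | 0.0056 |
| 5.0 | 10.0 | 5.0 | 6.6640 | 1.3649 | 5.7948 | 0.1793 | 5.3902 | 0.0349 |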


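Table 2. Skewness and kurtosis of beta-Pareto for various values of α, β and k.

| α | k | α3 (β = 1.0) | α4 (β = 1.0) | α3 (β = 2.0) | α4 (β = 2.0) | α3 (β = 5.0) | α4 (β = 5.0) | α3 (β = 7.5) | α4 (β = 7.5) |
|---|---|---|---|---|---|---|---|---|---|
| 0.75 | 5.0 | 4.980 | 82.874 | 3.127 | 21.060 | 2.580 | 13.740 | 2.483 | 12.702 |
| 0.75 | 7.5 | 3.506 | 28.201 | 2.783 | 16.167 | 2.481 | 12.682 | 2.422 | 12.077 |
| 0.75 | 10.0 | 3.066 | 20.274 | 2.638 | 14.416 | 2.434 | 12.206 | 2.392 | 11.784 |
| 0.75 | 15.0 | 2.725 | 15.557 | 2.508 | 12.974 | 2.390 | 11.761 | 2.363 | 11.504 |
| 2.0 | 5.0 | 4.111 | 60.527 | 2.262 | 13.033 | 1.695 | 7.806 | 1.595 | 7.110 |
| 2.0 | 7.5 | 2.790 | 20.077 | 1.954 | 9.880 | 1.601 | 7.151 | 1.535 | 6.721 |
| 2.0 | 10.0 | 2.389 | 14.267 | 1.822 | 8.760 | 1.556 | 6.859 | 1.506 | 6.540 |
| 2.0 | 15.0 | 2.076 | 10.834 | 1.703 | 7.844 | 1.513 | 6.587 | 1.477 | 6.368 |
| 5.0 | 5.0 | 3.769 | 52.915 | 1.866 | 10.225 | 1.234 | 5.691 | 1.116 | 5.107 |
| 5.0 | 7.5 | 2.499 | 17.323 | 1.564 | 7.707 | 1.135 | 5.194 | 1.050 | 4.812 |
| 5.0 | 10.0 | 2.109 | 12.244 | 1.435 | 6.820 | 1.087 | 4.976 | 1.018 | 4.676 |
| 5.0 | 15.0 | 1.802 | 9.258 | 1.316 | 6.100 | 1.041 | 4.774 | 0.986 | 4.548 |
| 7.5 | 5.0 | 3.691 | 51.280 | 1.768 | 9.616 | 1.107 | 5.2270 | 0.979 | 4.666 |
| 7.5 | 7.5 | 2.431 | 16.734 | 1.467 | 7.240 | 1.005 | 4.770 | 0.910 | 4.395 |
| 7.5 | 10.0 | 2.044 | 11.813 | 1.337 | 6.407 | 0.955 | 4.570 | 0.876 | 4.271 |
| 7.5 | 15.0 | 1.737 | 8.924 | 1.217 | 5.733 | 0.908 | 4.387 | 0.842 | 4.155 |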

5. The mean deviation

The deviation from the mean (in the case of symmetric distributions) or deviation from the median (in the case of skewed distributions) can be used as a measure of spread in a population. Let X be a beta-Pareto random variable with mean μ = E(X) and median M. The mean deviation from the mean and the mean deviation from the median are defined, respectively, by

$$D(\mu) = E\{|X-\mu|\} = \int_{\theta}^{\infty} |x-\mu|\,g(x)\,dx,$$

and

$$D(M) = E\{|X-M|\} = \int_{\theta}^{\infty} |x-M|\,g(x)\,dx.$$

The mean deviation from the mean can be simplified as

$$\begin{aligned} D(\mu) &= \int_{\theta}^{\infty} |x-\mu|\,g(x)\,dx = \int_{\theta}^{\mu} (\mu-x)\,g(x)\,dx + \int_{\mu}^{\infty} (x-\mu)\,g(x)\,dx \\ &= 2\int_{\theta}^{\mu} (\mu-x)\,g(x)\,dx \\ &= 2\mu\,G(\mu) - 2\int_{\theta}^{\mu} x\,g(x)\,dx. \end{aligned} \tag{13}$$

Similarly, the mean deviation from the median can be simplified as

$$\begin{aligned} D(M) &= \int_{\theta}^{\infty} |x-M|\,g(x)\,dx = \int_{\theta}^{M} (M-x)\,g(x)\,dx + \int_{M}^{\infty} (x-M)\,g(x)\,dx \\ &= 2\int_{\theta}^{M} (M-x)\,g(x)\,dx + \int_{\theta}^{\infty} (x-M)\,g(x)\,dx \\ &= 2M\,G(M) - 2\int_{\theta}^{M} x\,g(x)\,dx + E(X-M) \\ &= \mu + 2M\,G(M) - M - 2\int_{\theta}^{M} x\,g(x)\,dx. \end{aligned} \tag{14}$$

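By using the generalized binomial expansion

$$(1+s)^{\alpha} = \sum_{i=0}^{\infty} \binom{\alpha}{i} s^{i}, \qquad \text{where } \binom{\alpha}{i} = \frac{\alpha(\alpha-1)(\alpha-2)\cdots(\alpha-i+1)}{i!},$$

we can evaluate the integrals in Equations (13) and (14). Thus, we have

$$\begin{aligned} \int_{\theta}^{c} x\,g(x)\,dx &= \frac{k}{\theta\,B(\alpha,\beta)} \int_{\theta}^{c} x \left(\frac{x}{\theta}\right)^{-k\beta-1} \left\{1 - \left(\frac{x}{\theta}\right)^{-k}\right\}^{\alpha-1} dx \\ &= \frac{k}{\theta\,B(\alpha,\beta)} \int_{\theta}^{c} x \left(\frac{x}{\theta}\right)^{-k\beta-1} \sum_{i=0}^{\infty} \binom{\alpha-1}{i} (-1)^{i} \left(\frac{x}{\theta}\right)^{-ki} dx \\ &= \frac{k\theta}{B(\alpha,\beta)} \sum_{i=0}^{\infty} (-1)^{i} \binom{\alpha-1}{i} \frac{1 - (c/\theta)^{1-k(\beta+i)}}{k(\beta+i)-1}, \end{aligned} \tag{15}$$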

where c = μ or M. By using Equation (15) in Equations (13) and (14), the mean deviation from the mean and the mean deviation from the median are, respectively, given by

$$D(\mu) = 2\mu\,G(\mu) - \frac{2k\theta}{B(\alpha,\beta)} \sum_{i=0}^{\infty} (-1)^{i} \binom{\alpha-1}{i} \frac{1 - (\mu/\theta)^{1-k(\beta+i)}}{k(\beta+i)-1}, \tag{16}$$

and

$$D(M) = \mu + 2M\,G(M) - M - \frac{2k\theta}{B(\alpha,\beta)} \sum_{i=0}^{\infty} (-1)^{i} \binom{\alpha-1}{i} \frac{1 - (M/\theta)^{1-k(\beta+i)}}{k(\beta+i)-1}. \tag{17}$$
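Equations (15) and (16) can be evaluated by truncating the two series involved. The sketch below is a minimal illustration that assumes k(β + i) ≠ 1 for every i and stops when a term falls below a tolerance; all function names, the tolerance and the term limit are choices made here, not part of the paper.

```python
import math


def log_beta(a: float, b: float) -> float:
    """log B(a, b) via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def bpd_mean(alpha, beta, theta, k):
    """Mean of the BPD, Equation (9)."""
    return theta * math.exp(log_beta(alpha, beta - 1.0 / k) - log_beta(alpha, beta))


def bpd_cdf(x, alpha, beta, theta, k, tol=1e-14, max_terms=100000):
    """CDF of the BPD from the series in Equation (5)."""
    if x <= theta:
        return 0.0
    z = (x / theta) ** (-k)
    coef, power, series = 1.0, 1.0, 1.0 / beta
    for n in range(1, max_terms):
        coef *= (n - alpha) / n
        power *= z
        term = coef * power / (beta + n)
        series += term
        if abs(term) < tol:
            break
    return 1.0 - math.exp(beta * math.log(z) - log_beta(alpha, beta)) * series


def partial_first_moment(c, alpha, beta, theta, k, tol=1e-14, max_terms=100000):
    """Integral of x*g(x) over [theta, c] from the series in Equation (15)."""
    total = 0.0
    coef = 1.0  # (-1)**i * binom(alpha - 1, i), updated recursively
    for i in range(max_terms):
        expo = 1.0 - k * (beta + i)
        term = coef * (1.0 - (c / theta) ** expo) / (k * (beta + i) - 1.0)
        total += term
        coef *= -(alpha - 1.0 - i) / (i + 1.0)
        if coef == 0.0 or abs(term) < tol:
            break
    return k * theta * math.exp(-log_beta(alpha, beta)) * total


def mean_deviation_about_mean(alpha, beta, theta, k):
    """D(mu) from Equation (16)."""
    mu = bpd_mean(alpha, beta, theta, k)
    return (2.0 * mu * bpd_cdf(mu, alpha, beta, theta, k)
            - 2.0 * partial_first_moment(mu, alpha, beta, theta, k))


if __name__ == "__main__":
    # Pareto case (alpha = beta = 1, theta = 2, k = 3): D(mu) = 8/9.
    print(mean_deviation_about_mean(1.0, 1.0, 2.0, 3.0))
```

For integer α the series in Equation (15) terminates after α terms. For non-integer α its terms decay only algebraically in i, so the truncation error of this simple stopping rule can be non-negligible, and direct numerical quadrature of x g(x) may be preferable in that case.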

6. The Renyi and Shannon entropies

The entropy of a random variable is a measure of variation of the uncertainty [18, 20]. Entropy has been used in various situations in science and engineering, and numerous measures of entropy have been studied and compared in the literature.

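For a beta-Pareto random variable X with density g(x), the Renyi entropy is defined by

$$I_R(\xi) = \frac{1}{1-\xi} \log\left\{\int g^{\xi}(x)\,dx\right\}, \tag{18}$$

where ξ > 0 and ξ ≠ 1. By using the beta-Pareto density, we have

$$I = \int_{\theta}^{\infty} g^{\xi}(x)\,dx = \frac{k^{\xi}}{[\theta\,B(\alpha,\beta)]^{\xi}} \int_{\theta}^{\infty} \left(\frac{x}{\theta}\right)^{-\xi(k\beta+1)} \left\{1 - \left(\frac{x}{\theta}\right)^{-k}\right\}^{\xi(\alpha-1)} dx. \tag{19}$$

By using the substitution $y = 1 - (x/\theta)^{-k}$, Equation (19) may be written as

$$I = \left(\frac{k}{\theta}\right)^{\xi-1} \frac{B\bigl(\xi(\alpha-1)+1,\ (\beta+1/k)\xi - 1/k\bigr)}{B^{\xi}(\alpha,\beta)}. \tag{20}$$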

The Renyi entropy can now be written as

$$I_R(\xi) = -\log\left(\frac{k}{\theta}\right) + \frac{1}{1-\xi} \log\left\{\frac{B\bigl(\xi(\alpha-1)+1,\ (\beta+1/k)\xi - 1/k\bigr)}{B^{\xi}(\alpha,\beta)}\right\}. \tag{21}$$

A special case of Equation (18) is defined by Shannon [23] as E{−log(g(X))}. This is obtained by taking the limit of the Renyi entropy when ξ → 1:

$$E\{-\log(g(X))\} = \lim_{\xi\to 1} \left\{\frac{1}{1-\xi} \log\left[\int g^{\xi}(x)\,dx\right]\right\}.$$

By taking the limit of Equation (21) as ξ → 1, using L’Hospital’s rule and simplifying the result, we obtain

$$E\{-\log(g(X))\} = -\log\left(\frac{k}{\theta}\right) - (\alpha-1)\,\Psi(\alpha) - \left(\beta + \frac{1}{k}\right)\Psi(\beta) + \left(\alpha - 1 + \beta + \frac{1}{k}\right)\Psi(\alpha+\beta) + \log B(\alpha,\beta), \tag{22}$$

where $\Psi(z) = \Gamma'(z)/\Gamma(z)$ is the digamma function.
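The Renyi entropy in Equation (21) needs only the log-beta function. The sketch below evaluates it and approximates the Shannon entropy by taking ξ very close to 1, which avoids the digamma function (not available in the Python standard library). The function names and the choice ξ = 1 + 10⁻⁶ are assumptions of this illustration, not a method given in the paper.

```python
import math


def log_beta(a: float, b: float) -> float:
    """log B(a, b) via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def bpd_renyi_entropy(xi: float, alpha: float, beta: float,
                      theta: float, k: float) -> float:
    """Renyi entropy of the BPD from Equation (21); requires xi > 0, xi != 1."""
    if xi <= 0.0 or xi == 1.0:
        raise ValueError("xi must be positive and different from 1")
    a = xi * (alpha - 1.0) + 1.0
    b = (beta + 1.0 / k) * xi - 1.0 / k
    if a <= 0.0 or b <= 0.0:
        raise ValueError("beta-function arguments must be positive")
    log_ratio = log_beta(a, b) - xi * log_beta(alpha, beta)
    return -math.log(k / theta) + log_ratio / (1.0 - xi)


def bpd_shannon_entropy_approx(alpha, beta, theta, k, eps: float = 1e-6) -> float:
    """Numerical stand-in for Equation (22): Renyi entropy at xi = 1 + eps."""
    return bpd_renyi_entropy(1.0 + eps, alpha, beta, theta, k)


if __name__ == "__main__":
    # Pareto case (alpha = beta = 1, theta = 2, k = 3).
    print(bpd_shannon_entropy_approx(1.0, 1.0, 2.0, 3.0),
          math.log(2.0 / 3.0) + 1.0 + 1.0 / 3.0)
```

For α = β = 1, Equation (22) reduces to log(θ/k) + 1 + 1/k, because Ψ(2) − Ψ(1) = 1; the printed pair should therefore agree to several decimal places.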

7. Maximum likelihood estimates of the parameters

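The log-likelihood function of the BPD may be expressed as

$$\ln L(x; \alpha, \beta, \theta, k) = n\ln k - n\ln\theta + n\bigl(\ln\Gamma(\alpha+\beta) - \ln\Gamma(\alpha) - \ln\Gamma(\beta)\bigr) + (\alpha-1)\sum_{j=1}^{n} \ln\left[1 - \left(\frac{x_j}{\theta}\right)^{-k}\right] - (k\beta+1)\sum_{j=1}^{n} \ln\left(\frac{x_j}{\theta}\right). \tag{23}$$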
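Equation (23) can be coded directly, for example as the objective of a numerical optimiser. The sketch below is one possible form; the name `bpd_log_likelihood` is hypothetical. It assumes every observation is strictly greater than θ, because the term ln[1 − (x_j/θ)^{−k}] is undefined at x_j = θ.

```python
import math
from typing import Sequence


def bpd_log_likelihood(data: Sequence[float], alpha: float, beta: float,
                       theta: float, k: float) -> float:
    """Log-likelihood of Equation (23) for observations strictly above theta."""
    if min(alpha, beta, theta, k) <= 0.0:
        raise ValueError("all parameters must be positive")
    if any(x <= theta for x in data):
        raise ValueError("this sketch requires every observation to exceed theta")
    n = len(data)
    log_ratios = [math.log(x / theta) for x in data]       # ln(x_j / theta)
    log_terms = [math.log(1.0 - math.exp(-k * lr)) for lr in log_ratios]
    const = n * (math.log(k) - math.log(theta)
                 + math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta))
    return const + (alpha - 1.0) * sum(log_terms) - (k * beta + 1.0) * sum(log_ratios)


if __name__ == "__main__":
    sample = [3.5, 4.0, 6.0, 10.0]  # made-up values for illustration only
    print(bpd_log_likelihood(sample, 2.0, 2.0, 3.0, 2.0))
```

Negating this function gives an objective that a general-purpose minimiser could work on in (α, β, k) with θ held fixed.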

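Differentiating Equation (23) with respect to k, α and β, respectively, and setting the results equal to zero, we have

$$\frac{\partial \ln L(x)}{\partial k} = \frac{n}{k} - \sum_{j=1}^{n} \left\{\beta + (\alpha-1)\left[1 - \left(\frac{x_j}{\theta}\right)^{k}\right]^{-1}\right\} \ln\left(\frac{x_j}{\theta}\right) = 0, \tag{24}$$

$$\frac{\partial \ln L(x)}{\partial \alpha} = n\{\Psi(\alpha+\beta) - \Psi(\alpha)\} + \sum_{j=1}^{n} \ln\left[1 - \left(\frac{x_j}{\theta}\right)^{-k}\right] = 0, \tag{25}$$

and

$$\frac{\partial \ln L(x)}{\partial \beta} = n\{\Psi(\alpha+\beta) - \Psi(\beta)\} - k\sum_{j=1}^{n} \ln\left(\frac{x_j}{\theta}\right) = 0. \tag{26}$$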

Since x ≥ θ, the maximum likelihood estimate of θ is the first-order statistic $x_{(1)}$. The maximum likelihood estimates $\hat{\alpha}$, $\hat{\beta}$ and $\hat{k}$ for the parameters α, β and k, respectively, are obtained by solving iteratively Equations (24)–(26). The initial estimates of α, β and k can be obtained as follows: fit the Pareto density to the data. The maximum likelihood estimate of θ is $\hat{\theta} = x_{(1)}$, the first-order statistic, and the maximum likelihood estimate of k is $\hat{k} = n\left[\sum \ln(x_i/\hat{\theta})\right]^{-1}$. By using $\hat{\theta}$ and $\hat{k}$, we transform the data to beta density data and then find the maximum likelihood estimates of α and β or the moment estimates of α and β from the beta density. The initial estimates for the BPD are the moment or maximum likelihood estimates of α and β, and the estimate $\hat{k}$. By using Equations (24)–(26), the second partial derivatives may be expressed as

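$$\frac{\partial^{2} \ln L(x)}{\partial k\,\partial \beta} = -\sum_{j=1}^{n} \ln\left(\frac{x_j}{\theta}\right),$$

$$\frac{\partial^{2} \ln L(x)}{\partial k\,\partial \alpha} = \sum_{j=1}^{n} \left[\left(\frac{x_j}{\theta}\right)^{k} - 1\right]^{-1} \ln\left(\frac{x_j}{\theta}\right),$$

$$\frac{\partial^{2} \ln L(x)}{\partial k^{2}} = -\frac{n}{k^{2}} - (\alpha-1)\sum_{j=1}^{n} \left\{\frac{\ln(x_j/\theta)}{1 - (x_j/\theta)^{k}}\right\}^{2} \left(\frac{x_j}{\theta}\right)^{k},$$

$$\frac{\partial^{2} \ln L(x)}{\partial \alpha\,\partial \beta} = n\,\Psi'(\alpha+\beta),$$

$$\frac{\partial^{2} \ln L(x)}{\partial \alpha^{2}} = n\{\Psi'(\alpha+\beta) - \Psi'(\alpha)\},$$

$$\frac{\partial^{2} \ln L(x)}{\partial \beta^{2}} = n\{\Psi'(\alpha+\beta) - \Psi'(\beta)\}.$$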

These second partial derivatives can be used to compute the Fisher information matrix. The expectations of the second partial derivatives are not available in closed form. A numerical method can be employed to obtain the expectations.
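The starting values described in this section can be sketched as follows. The paper does not spell out how the data are transformed to beta data; this illustration assumes the map y = 1 − (x/θ̂)^{−k̂}, i.e. the inverse of the transformation in Theorem 3, followed by the usual method-of-moments estimates for a beta sample. All function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def pareto_mle(data: Sequence[float]) -> Tuple[float, float]:
    """Pareto fit: theta_hat = smallest observation, k_hat = n / sum ln(x_i/theta_hat)."""
    theta_hat = min(data)
    log_sum = sum(math.log(x / theta_hat) for x in data)
    return theta_hat, len(data) / log_sum


def beta_moment_estimates(y: Sequence[float]) -> Tuple[float, float]:
    """Method-of-moments estimates of the two beta shape parameters."""
    n = len(y)
    mean = sum(y) / n
    var = sum((u - mean) ** 2 for u in y) / n
    common = mean * (1.0 - mean) / var - 1.0
    return mean * common, (1.0 - mean) * common


def bpd_starting_values(data: Sequence[float]) -> Tuple[float, float, float, float]:
    """Initial (alpha, beta, theta, k) for an iterative solution of (24)-(26)."""
    theta_hat, k_hat = pareto_mle(data)
    # Assumed transformation to (0, 1): y = F(x) for the fitted Pareto CDF.
    y = [1.0 - (x / theta_hat) ** (-k_hat) for x in data]
    alpha0, beta0 = beta_moment_estimates(y)
    return alpha0, beta0, theta_hat, k_hat


if __name__ == "__main__":
    toy = [1.0, math.e, math.e ** 2]  # made-up data: theta_hat = 1, k_hat = 1
    print(bpd_starting_values(toy))
```

Moment estimates are used here because the smallest observation maps to y = 0, where the beta log-likelihood is not finite unless its first shape parameter equals 1.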

8. Applications of the BPD

In this section, the BPD is fitted to data sets from two rivers. These are the exceedances of flood peaks, discussed in Choulakian and Stephens [11], and the flood data illustrated by Mudholkar and Huston [24] in the application of the exponentiated Weibull distribution. These sets of data are fitted by using the Pareto distribution, the three-parameter Weibull, the generalized Pareto distribution and the beta-Pareto distribution.

8.1. The Wheaton River data

The data are the exceedances of flood peaks (in m³/s) of the Wheaton River near Carcross in Yukon Territory, Canada. The data consist of 72 exceedances for the years 1958–1984, rounded to one decimal place. These data were analysed by Choulakian and Stephens [11] and are given in Table 3. The distribution is highly skewed to the right. Choulakian and Stephens [11] proposed the GPD defined as $f(x) = (1/a)(1 - kx/a)^{(1-k)/k}$, where the range of x is 0 ≤ x < ∞ for k ≤ 0 and 0 ≤ x ≤ a/k for k > 0. This distribution is often called the ‘peaks over thresholds’ model since it is used to model exceedances over threshold level in flood control. For more details about the source of the data and the GPD, one may refer to Choulakian and Stephens [11].

The data are fitted by using (a) the Pareto distribution, (b) the three-parameter Weibull distribution, (c) the GPD and (d) the BPD proposed in this article. The Kolmogorov–Smirnov (K–S) goodness-of-fit statistic is used for the comparison of the fits. The parameters are estimated by the maximum likelihood technique. The maximum likelihood estimates and the P-values based on the K–S goodness-of-fit statistics are given in Table 4. Figure 3 displays the empirical and the four fitted cumulative distribution functions. Both the GPD and BPD fit the data adequately, while the three-parameter Weibull and the Pareto distributions do not provide adequate fit to the data.

Table 3. Exceedances of Wheaton River flood data.

1.7 2.2 14.4 1.1 0.4 20.6 5.3 0.7 1.9 13.0 12.0 9.3
1.4 18.7 8.5 25.5 11.6 14.1 22.1 1.1 2.5 14.4 1.7 37.6
0.6 2.2 39.0 0.3 15.0 11.0 7.3 22.9 1.7 0.1 1.1 0.6
9.0 1.7 7.0 20.1 0.4 2.8 14.1 9.9 10.4 10.7 30.0 3.6
5.6 30.8 13.3 4.2 25.5 3.4 11.9 21.5 27.6 36.4 2.7 64.0
1.5 2.5 27.4 1.0 27.1 20.2 16.8 5.3 9.7 27.5 2.5 27.0

Table 4. Parameter estimates and K–S statistics for Wheaton River data.

| Distribution | Pareto | Three-parameter Weibull | Generalized Pareto | Beta-Pareto |
|---|---|---|---|---|
| Maximum likelihood estimates | k = 0.2438, θ = 0.1 | Shape = 0.8471, Scale = 11.20, Threshold = 0.099 | k = −0.00093, a = 12.193 | α = 7.6954, β = 85.75, k = 0.0208, θ = 0.1 |
| K–S statistics | 2.7029 | 1.6734 | 1.205 | 1.2534 |
| P-value | <0.000 | 0.0074 | 0.1094 | 0.0864 |

Figure 3. CDFs for Wheaton River data.
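For readers who wish to compare fitted CDFs against the empirical CDF, the following sketch computes the unscaled Kolmogorov–Smirnov distance $D_n = \sup_x |F_n(x) - F(x)|$ for any fitted CDF supplied as a function. This is an illustration only: the tabulated K–S statistics may use a different scaling, and the names `ks_distance` and `pareto_cdf` are introduced here.

```python
from typing import Callable, Sequence


def ks_distance(sample: Sequence[float], cdf: Callable[[float], float]) -> float:
    """Unscaled K-S distance between the empirical CDF of `sample` and `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i-1)/n to i/n at the i-th order statistic.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def pareto_cdf(theta: float, k: float) -> Callable[[float], float]:
    """Return the Pareto CDF 1 - (x/theta)**(-k) as a function of x."""
    def cdf(x: float) -> float:
        return 0.0 if x < theta else 1.0 - (x / theta) ** (-k)
    return cdf


if __name__ == "__main__":
    # First row of Table 3, with the Pareto estimates reported in Table 4.
    row = [1.7, 2.2, 14.4, 1.1, 0.4, 20.6, 5.3, 0.7, 1.9, 13.0, 12.0, 9.3]
    print(ks_distance(row, pareto_cdf(0.1, 0.2438)))
```

Passing the full 72 observations of Table 3 together with a fitted CDF gives the distance for that model; the same function works for any of the four distributions once its CDF is available.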

8.2. The Floyd River data

The second flood data is for the Floyd River located in James, Iowa, USA. The Floyd Riverflood rates for the years 1935–1973 are provided in Table 5. The application of the exponentiatedWeibull distribution was illustrated by using this data [24]. For more details and the source of thedata, see [24].


Table 5. Annual flood discharge rates of the Floyd River.

Years        Flood discharge (ft³/s)

1935–1944    1460, 4050, 3570, 2060, 1300, 1390, 1720, 6280, 1360, 7440
1945–1954    5320, 1400, 3240, 2710, 4520, 4840, 8320, 13900, 71500, 6250
1955–1964    2260, 318, 1330, 970, 1920, 15100, 2870, 20600, 3810, 726
1965–1973    7500, 7170, 2000, 829, 17300, 4740, 13400, 2940, 5660

Table 6. Parameter estimates and K–S statistics for the Floyd River flood data.

Distribution              Maximum likelihood estimates                        K–S statistic   P-value

Pareto                    k = 0.4125, θ = 318                                 1.9505          <0.000
Three-parameter Weibull   Shape = 0.7461, Scale = 5301, Threshold = 318.0     0.7444          0.6366
Generalized Pareto        k = −0.3071, a = 4520                               0.7037          0.7050
Beta-Pareto               α = 6.1550, β = 24.2434, k = 0.0926, θ = 318        0.5838          0.8849

The four distributions shown in Table 6 are applied to model these data. Famoye et al. [18] fitted the data using the beta-Weibull and compared the fit with that of the two-parameter Weibull and the exponentiated Weibull distributions. Their study indicated that both the beta-Weibull and the exponentiated Weibull distributions fit the data equally well. The log-likelihood value is −376.35 for the exponentiated Weibull distribution and −382.13 for the beta-Weibull distribution. The log-likelihood value for the beta-Pareto model is −365.45. The BPD provides the best fit to the flood data with a P-value of 0.8849 for the K–S statistic, followed successively by the GPD and the three-parameter Weibull distribution. Figure 4 displays the empirical and four fitted cumulative

Figure 4. CDFs for Floyd River flood data.


distribution functions for the data. The three-parameter Weibull, the GPD and the BPD provide adequate fits. The fit from the Pareto distribution is not adequate.
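A fitted BPD curve of the kind shown in Figure 4 can be evaluated numerically. The sketch below is an assumption-laden illustration, not the authors' code. It assumes that the BPD CDF is the regularised incomplete beta function I_{F(x)}(α, β) evaluated at the Pareto CDF F(x) = 1 − (x/θ)^{−k}, as in the beta-generated construction of [16]. The function names are hypothetical, and the incomplete beta integral is approximated by composite Simpson's rule, which is adequate only when α ≥ 1 and β ≥ 1.

```python
import math


def pareto_cdf(x, k, theta):
    """Assumed Pareto CDF: F(x) = 1 - (x / theta)**(-k) for x > theta, else 0."""
    return 1.0 - (x / theta) ** (-k) if x > theta else 0.0


def reg_inc_beta(u, alpha, beta, steps=2000):
    """Regularised incomplete beta I_u(alpha, beta) by composite Simpson's rule.

    Assumes alpha >= 1 and beta >= 1 (bounded integrand) and an even `steps`.
    """
    if u <= 0.0:
        return 0.0
    if u >= 1.0:
        return 1.0

    def g(t):
        return t ** (alpha - 1.0) * (1.0 - t) ** (beta - 1.0)

    h = u / steps
    total = g(0.0) + g(u)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    # Complete beta function B(alpha, beta) via log-gamma for numerical stability.
    log_b = math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta)
    return (total * h / 3.0) / math.exp(log_b)


def bpd_cdf(x, alpha, beta, k, theta):
    """Assumed beta-Pareto CDF: G(x) = I_{F(x)}(alpha, beta), F the Pareto CDF."""
    return reg_inc_beta(pareto_cdf(x, k, theta), alpha, beta)


# Beta-Pareto estimates reported in Table 6 for the Floyd River data.
alpha_hat, beta_hat, k_hat, theta_hat = 6.1550, 24.2434, 0.0926, 318.0
for discharge in (1000, 3500, 10000, 71500):
    g_x = bpd_cdf(discharge, alpha_hat, beta_hat, k_hat, theta_hat)
    print(discharge, round(g_x, 4))
```

Plotting these values against the empirical CDF of Table 5 gives a visual check comparable to Figure 4, subject to the assumptions stated above.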

9. Conclusion

We define and study the BPD. Various properties of the distribution are investigated. The distribution is found to be unimodal and reduces to some existing distributions that are known in the literature. The method of maximum likelihood is suggested for estimating the parameters of the BPD.

In one application to flood data, the BPD provides a significantly better fit than the Pareto and the three-parameter Weibull distributions; and in the other application, the BPD is as good as the GPD. The GPD as discussed in Choulakian and Stephens [11] was developed specifically for modelling the exceedances over a threshold level in flood control. The BPD is not developed specifically for fitting the ‘peaks over thresholds’ data. However, the versatility of the BPD (which has two additional parameters) enables it to adequately fit this type of data. The results in this paper indicate that the BPD appears to be a good model for handling extreme-value data. Further study is needed to explore the applications of the BPD to other situations where heavy- and long-tail distributions have been used. The development of goodness-of-fit statistics for differentiating between various distributions and the BPD will be taken up in future work.

Acknowledgements

The authors express their gratitude to the referees for valuable comments and suggestions which have enhanced the paper.

References

[1] N. Krishnaji, Characterization of the Pareto distribution through a model of underreported incomes, Econometrica 38(2) (1970), pp. 251–255.

[2] G. Alperovich, An explanatory model of city-size distribution: Evidence from cross-country data, Urban Stud. 30(9) (1993), pp. 1591–1601.

[3] W.J. Reed, On the rank-size distribution for human settlements, J. Regional Sci. 42(1) (2002), pp. 1–17.

[4] J.L. Devore, Probability and Statistics for Engineering and the Sciences, 6th ed., Duxbury, Ontario, 2004.

[5] M.E.J. Newman, Power laws, Pareto distributions and Zipf’s law, Contemp. Phys. 46(5) (2005), pp. 323–351.

[6] R.L. Smith, Extreme value analysis of environmental time series: An application to trend detection in ground-level ozone, Statist. Sci. 4(4) (1989), pp. 367–393.

[7] M. Levy and H. Levy, Investment talent and the Pareto wealth distribution: Theoretical and experimental analysis, Rev. Econ. Statist. 85(3) (2003), pp. 709–725.

[8] I.B. Aban, M.M. Meerschaert, and A.K. Panorska, Parameter estimation for the truncated Pareto distribution, J. Am. Statist. Assoc. 101(473) (2006), pp. 270–277.

[9] S.M. Burroughs and S.F. Tebbens, Upper-truncated power law distributions, Fractals 9(2) (2001), pp. 209–222.

[10] J. Pickands, Statistical inference using extreme order statistics, Ann. Statist. 3(1) (1975), pp. 119–131.

[11] V. Choulakian and M.A. Stephens, Goodness-of-fit for the generalized Pareto distribution, Technometrics 43(4) (2001), pp. 478–484.

[12] R.V. Hogg, J.W. McKean, and A.T. Craig, Introduction to Mathematical Statistics, 6th ed., Pearson Prentice-Hall, New Jersey, 2005.

[13] N.L. Johnson, S. Kotz, and N. Balakrishnan, Continuous Univariate Distributions, 2nd ed., vol. 1, Wiley, New York, 1994.

[14] D. Zwillinger and S. Kokoska, Standard Probability and Statistics Tables and Formulae, Chapman & Hall/CRC, Florida, 2000.

[15] H. Fofack and J.P. Nolan, Tail behavior, modes and other characteristics of stable distributions, Extremes 2(1) (1999), pp. 39–58.

[16] N. Eugene, C. Lee, and F. Famoye, The beta-normal distribution and its applications, Commun. Statist. Theory Meth. (2002), pp. 497–512.


[17] K.P. Singh, C.M. Lee, and E.O. George, On generalized log-logistic model for censored survival data, Biomet. J. 30 (1988), pp. 843–850.

[18] F. Famoye, C. Lee, and O. Olumolade, The beta-Weibull distribution, J. Statist. Theory Appl. 4(2) (2005), pp. 121–136.

[19] S. Nadarajah and S. Kotz, The beta Gumbel distribution, Math. Probab. Eng. 10 (2004), pp. 323–332.

[20] S. Nadarajah and S. Kotz, The beta exponential distribution, Reliability Eng. Syst. Safety 91 (2006), pp. 689–697.

[21] E.W. Weisstein, Incomplete beta function, (2003). Available at MathWorld–A Wolfram Web Resource http://mathworld.wolfram.com/IncompleteBetaFunction.html.

[22] D. Dufresne, Fitting combination of exponentials to probability distributions, First version of a Working Paper, Center for Actuarial Studies, University of Melbourne, 2005, pp. 1–23.

[23] C.E. Shannon, A mathematical theory of communication, Bell Syst. Tech. J. 27 (1948), pp. 379–432.

[24] G.S. Mudholkar and A.D. Huston, The exponentiated Weibull family: Some properties and a flood data application, Commun. Statist. Theory Meth. 23(4) (1996), pp. 1149–1171.
