Power-Law Adjusted Failure-Time Models
William J. Reed
Abstract—A simple adjustment to parametric failure-time distributions, which allows for much greater flexibility in the shape of the hazard-rate function, is considered. Analytical expressions for the distributions of the power-law adjusted Weibull, gamma, log-gamma, generalized gamma, lognormal and Pareto distributions are given. Most of these allow for bathtub-shaped and other multi-modal forms of the hazard rate. The new distributions are fitted to real failure-time data which exhibit a multi-modal hazard-rate function, and the fits are compared.
Index Terms—survival analysis; bathtub hazard; accelerated failure time (AFT) regression; power-law distribution.
I. INTRODUCTION
Parametric distributions play an important role in the analysis of lifetime data, especially in accelerated failure time (AFT) regression models. Generally speaking, analysis based on a parametric model will be more precise than that based on a nonparametric or semi-parametric model, because it will have fewer unknown parameters. However, this is contingent on it being possible to find a suitable parametric model to fit the data. Unfortunately, for most of the common distributions employed there is very little flexibility in the shape of the hazard rate function. In particular, none of the two-parameter distributions customarily employed can be used to model a bathtub-shaped hazard.
There are a number of three-parameter distributions which allow a bathtub-shaped hazard, including the exponentiated Weibull [3], the generalized Weibull [4] and the generalized gamma (see e.g. [1]) distributions. An addition to these was proposed in a recent article by Reed [5]. This distribution, which is a special case of a double Pareto-lognormal distribution [6], can be characterised as the product of independent random variables, one with a lognormal distribution and the other with a power-law distribution on $[0, 1]$. For this reason the new distribution was called the lognormal-power function distribution. It can be thought of as an extension of the lognormal distribution.
In this article it is shown how any simple parametric failure-time distribution can be extended in a similar way to allow for much greater flexibility in its form, including in most cases the possibility of bathtub-shaped hazard-rate functions. Precisely, the failure time $T$ is modelled as the product $T \overset{d}{=} T_0 U$, where $T_0$ follows the “simple” failure-time distribution and $U$ follows the power-law distribution with density $\lambda u^{\lambda-1}$ on $[0, 1]$. Alternatively this can be expressed as $T \overset{d}{=} T_0/V$ where $V$ has a Pareto distribution, with density $\lambda/v^{\lambda+1}$ on $[1,\infty)$.
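The product representation also gives a direct way to simulate from a power-law adjusted distribution. The sketch below is a minimal illustration in Python (the analysis in this paper used R); the function names and the choice of a Weibull $T_0$ are assumptions made for the example, and it is intended for moderate values of $\lambda$.

```python
import random


def sample_power_law(lam, rng):
    """Draw U with density lam * u**(lam - 1) on (0, 1] by inverting its CDF u**lam."""
    r = 1.0 - rng.random()              # uniform on (0, 1], so U is never exactly 0
    return r ** (1.0 / lam)


def sample_pareto(lam, rng):
    """Draw V = 1 / U, which has density lam / v**(lam + 1) on [1, inf)."""
    return 1.0 / sample_power_law(lam, rng)


def sample_adjusted_weibull(alpha, beta, lam, rng):
    """Draw T = T0 * U with T0 Weibull, survivor function exp(-alpha * t**beta)."""
    # If E is unit exponential, (E / alpha)**(1 / beta) has that survivor function.
    t0 = (rng.expovariate(1.0) / alpha) ** (1.0 / beta)
    return t0 * sample_power_law(lam, rng)


if __name__ == "__main__":
    rng = random.Random(1)
    lam = 2.0
    draws = [sample_power_law(lam, rng) for _ in range(100_000)]
    # The mean of U is lam / (lam + 1); the sample mean should be close to it.
    print(sum(draws) / len(draws), lam / (lam + 1.0))
```

Dividing a draw of $T_0$ by `sample_pareto(lam, rng)` instead of multiplying by `sample_power_law(lam, rng)` gives the same distribution, which mirrors the two equivalent forms above.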
As might be expected, it is not possible for every parametrically specified distribution (of $T_0$) to obtain an analytical expression for the resulting power-law modified density. However, it turns out to be possible to do so for a number of the more common failure-time distributions, including the lognormal [5], exponential, Weibull, gamma, log-gamma, Pareto and generalized gamma distributions. These distributions are considered in this article. In all cases except the lognormal and Pareto, the resulting power-function modified densities can be expressed in terms of an incomplete gamma function.
Manuscript received March 9, 2012; revised March, 2012. This work was supported in part by NSERC Grant OGP 7252.
W. J. Reed is emeritus professor at Department of Mathematics and Statistics, University of Victoria, PO Box 3060 STN CSC, Victoria, B.C., Canada V8W 3R4 e-mail:[email protected]
In Sec. II the distribution theory associated with the power-law modification is presented, and in Sec. III maximum likelihood estimation is discussed. In Sec. IV the results of fitting the various power-law modified failure-time distributions to data with a multi-modal hazard rate are presented.
II. THEORY
Let $T_0$ be a random variable with a known continuous failure-time distribution. The power-law modified form of this distribution can be represented by a random variable $T$ with

$$T \overset{d}{=} T_0 U$$

where $U$, independent of $T_0$, follows the power-law distribution with density $\lambda u^{\lambda-1}$ ($\lambda > 0$) on the interval $[0, 1]$. Taking logarithms leads to

$$X = \log(T) \overset{d}{=} Z_0 - \frac{1}{\lambda} E$$

where $Z_0 = \log T_0$ (with survivor function and density $S_0(z)$ and $f_0(z)$, say) and $E$ is a standard (unit mean) exponential random variable. The survivor function for $X$ can be found as a convolution as follows:
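
$$
\begin{aligned}
S_X(x) &= P(Z_0 - E/\lambda \ge x) \\
&= P(E \le \lambda(Z_0 - x)) \\
&= \mathbb{E}\{P(E \le \lambda(Z_0 - x) \mid Z_0)\} \\
&= \mathbb{E}\{[1 - e^{-\lambda(Z_0 - x)}]\, I[Z_0 - x > 0]\} \\
&= \int_x^\infty [1 - e^{-\lambda(z - x)}]\, f_0(z)\,dz \\
&= S_0(x) - e^{\lambda x} \int_x^\infty e^{-\lambda z} f_0(z)\,dz
\end{aligned}
\tag{1}
$$

where the expectation $\mathbb{E}$ is with respect to $Z_0$ and $I[\cdot]$ is a Bernoulli indicator random variable. Upon integrating by parts one obtains

$$S_X(x) = \lambda e^{\lambda x} \int_x^\infty e^{-\lambda z} S_0(z)\,dz. \tag{2}$$

From this, by differentiation and using (1), one obtains the corresponding formula for the density of $X$

$$f_X(x) = \lambda e^{\lambda x} \int_x^\infty e^{-\lambda z} f_0(z)\,dz. \tag{3}$$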
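
For readers who want the intermediate steps, the passage from (1) to (2) and then to (3) can be written out as follows. This is a reconstruction of the standard argument rather than text from the original paper; it uses only $f_0 = -S_0'$, $\lambda > 0$ and $0 \le S_0 \le 1$, so that the boundary term at infinity vanishes.

```latex
\begin{align*}
\int_x^\infty e^{-\lambda z} f_0(z)\,dz
  &= \Bigl[-e^{-\lambda z} S_0(z)\Bigr]_x^\infty
     - \lambda \int_x^\infty e^{-\lambda z} S_0(z)\,dz \\
  &= e^{-\lambda x} S_0(x) - \lambda \int_x^\infty e^{-\lambda z} S_0(z)\,dz, \\[4pt]
S_X(x) &= S_0(x) - e^{\lambda x}\Bigl[e^{-\lambda x} S_0(x)
     - \lambda \int_x^\infty e^{-\lambda z} S_0(z)\,dz\Bigr]
   = \lambda e^{\lambda x} \int_x^\infty e^{-\lambda z} S_0(z)\,dz, \\[4pt]
f_X(x) &= -\frac{d}{dx} S_X(x)
   = -\lambda S_X(x) + \lambda S_0(x)
   = \lambda e^{\lambda x} \int_x^\infty e^{-\lambda z} f_0(z)\,dz .
\end{align*}
```

The last equality uses (1) in the form $S_0(x) - S_X(x) = e^{\lambda x}\int_x^\infty e^{-\lambda z} f_0(z)\,dz$. A useful by-product is the identity $f_X(x) = \lambda\,[S_0(x) - S_X(x)]$.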
Proceedings of the World Congress on Engineering 2012 Vol I WCE
2012, July 4 - 6, 2012, London, U.K.
ISBN: 978-988-19251-3-8 ISSN: 2078-0958 (Print); ISSN: 2078-0966
(Online)
WCE 2012
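From (2) and (3) the survivor function and density of $T$ in terms of those of $T_0$ ($S_{T_0}(t)$ and $f_{T_0}(t)$) can be easily obtained:

$$S_T(t) = \lambda t^{\lambda} \int_t^\infty u^{-\lambda-1} S_{T_0}(u)\,du, \tag{4}$$

$$f_T(t) = \lambda t^{\lambda-1} \int_t^\infty u^{-\lambda} f_{T_0}(u)\,du. \tag{5}$$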
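
Formulas (4) and (5) can be evaluated numerically for any choice of $T_0$. The sketch below shows one possible approach in Python; it is not the author's implementation. Substituting $u = t\,r^{-1/\lambda}$ turns (4) and (5) into integrals over $(0,1)$, namely $S_T(t)=\int_0^1 S_{T_0}(t r^{-1/\lambda})\,dr$ and $f_T(t)=\int_0^1 r^{-1/\lambda} f_{T_0}(t r^{-1/\lambda})\,dr$, which are then approximated with a midpoint rule. The function names, the number of nodes and the unit-exponential example are assumptions, and the code is meant for moderate $\lambda$ only (a very small $\lambda$ makes $r^{-1/\lambda}$ overflow).

```python
import math


def adjusted_survivor(surv0, t, lam, n=20_000):
    """Approximate S_T(t) in (4) as the integral over (0, 1) of surv0(t * r**(-1/lam))."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * h                   # midpoint of the k-th subinterval
        total += surv0(t * r ** (-1.0 / lam))
    return total * h


def adjusted_density(dens0, t, lam, n=20_000):
    """Approximate f_T(t) in (5) as the integral of r**(-1/lam) * dens0(t * r**(-1/lam))."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * h
        scale = r ** (-1.0 / lam)           # this is u / t after the substitution
        total += scale * dens0(t * scale)
    return total * h


def exp_survivor(u):
    """Survivor function of a unit-rate exponential T0 (illustrative choice)."""
    return math.exp(-u)


def exp_density(u):
    """Density of a unit-rate exponential T0 (illustrative choice)."""
    return math.exp(-u)


if __name__ == "__main__":
    t, lam = 0.5, 2.0
    s = adjusted_survivor(exp_survivor, t, lam)
    f = adjusted_density(exp_density, t, lam)
    print("S_T:", s, "f_T:", f, "hazard:", f / s)
```

The two integrals can be cross-checked against each other, since (1) and (3) imply $f_T(t) = (\lambda/t)\,[S_{T_0}(t) - S_T(t)]$.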
We now consider power-law modified forms of some specific failure-time distributions.
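Weibull and exponential model. If $T_0$ has a Weibull distribution with hazard rate function $h_{T_0}(t) = \alpha\beta t^{\beta-1}$, its survivor function and density are $S_{T_0}(t) = \exp(-\alpha t^{\beta})$ and $f_{T_0}(t) = \alpha\beta t^{\beta-1}\exp(-\alpha t^{\beta})$. The hazard rate is monotone increasing for $\beta > 1$ and monotone decreasing for $\beta < 1$. In the case $\beta = 1$ it is constant and the Weibull distribution reduces to an exponential distribution. The survivor function and density for $Z_0 = \log T_0$ are

$$S_0(z) = \exp(-\alpha e^{\beta z}) \quad\text{and}\quad f_0(z) = \alpha\beta \exp(\beta z - \alpha e^{\beta z}).$$

From (2) and (3), the survivor function and density of $X = \log T$, where $T$ follows the power-law adjusted Weibull distribution, are

$$S_X(x) = \frac{\lambda \alpha^{\lambda/\beta}}{\beta}\, e^{\lambda x}\, I(\alpha e^{\beta x}, -\lambda/\beta)$$

$$f_X(x) = \lambda \alpha^{\lambda/\beta}\, e^{\lambda x}\, I(\alpha e^{\beta x}, 1 - \lambda/\beta)$$

where $I$ is the incomplete gamma function

$$I(y,\theta) = \int_y^\infty u^{\theta-1} e^{-u}\,du. \tag{6}$$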
Note that although the ordinary gamma function can be expressed as the integral $\Gamma(\theta) = \int_0^\infty u^{\theta-1} e^{-u}\,du$ only for $\theta > 0$, the incomplete gamma function $I(y,\theta)$ evaluated at $y > 0$ converges for all real $\theta$. Thus $S_X(x)$ and $f_X(x)$ above are well-defined since $\alpha e^{\beta x} > 0$.
The survivor function, density and hazard-rate function for $T$ are easily computed from the above as

$$S_T(t) = S_X(\log t); \qquad f_T(t) = \frac{1}{t} f_X(\log t); \qquad h_T(t) = \frac{f_T(t)}{S_T(t)}.$$
Fig. 1 (top row) illustrates three shapes that the hazard rate function of the power-law adjusted Weibull distribution can assume.
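Gamma model. If $T_0$ follows a gamma distribution with scale parameter $\theta^{-1}$ and shape parameter $\kappa$, then the survivor function and density of $Z_0 = \log T_0$ are

$$S_0(z) = \frac{I(\theta e^{z},\kappa)}{\Gamma(\kappa)} \quad\text{and}\quad f_0(z) = \frac{\theta^{\kappa}}{\Gamma(\kappa)} \exp(\kappa z - \theta e^{z}).$$

From (2) and (3), the survivor function and density of $X = \log T$, where $T$ follows the power-law adjusted gamma distribution, are

$$S_X(x) = \frac{1}{\Gamma(\kappa)} \left[ I(\theta e^{x},\kappa) - \theta^{\lambda} e^{\lambda x} I(\theta e^{x},\kappa-\lambda) \right]$$

$$f_X(x) = \frac{\lambda \theta^{\lambda}}{\Gamma(\kappa)}\, e^{\lambda x}\, I(\theta e^{x},\kappa-\lambda)$$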

Fig. 1. Some shapes of the hazard rate function for various power-law adjusted distributions (twelve panels, each plotting hazard rate against time). Top row: Weibull distribution with $\alpha = 1$: (l. hand) $\beta = 1$ (exponential distribution) and $\lambda = 0.02$; (centre) $\beta = 2$ and $\lambda = 2$; (r. hand) $\beta = 3$ and $\lambda = 0.02$. Second row: gamma distribution with $\theta = 0.25$: (l. hand) $\kappa = 0.01$ and $\lambda = 1$; (centre) $\kappa = 0.01$ and $\lambda = 2.5$; (r. hand) $\kappa = 0.1$ and $\lambda = 7$. Third row: log-gamma distribution with $\theta = 20$: (l. hand) $\kappa = 50$ and $\lambda = 0.01$; (centre) $\kappa = 10$ and $\lambda = 0.01$; (r. hand) $\kappa = 5$ and $\lambda = 0.5$. Bottom row: Pareto distribution with $\tau_0 = 1.5$: (l. hand) $\alpha = 1$ and $\lambda = 0.1$; (centre) $\alpha = 15$ and $\lambda = 2$; (r. hand) $\alpha = 15$ and $\lambda = 0.2$.

Fig. 1 (second row) illustrates some shapes that the hazard rate function of the power-law adjusted gamma distribution can assume.
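Log-gamma model. If $Z_0 = \log T_0$ follows a gamma distribution, so that $T_0$ has density $f_{T_0}(t) = \frac{\theta^{\kappa}}{\Gamma(\kappa)}\, t^{-(\theta+1)} (\log t)^{\kappa-1}$ with support on $[1,\infty)$, then from (2) and (3) it is easy to show that the power-law adjusted random variable $T$ has support on $(0,\infty)$ and that $X = \log T$ has survivor function and density

$$
S_X(x) =
\begin{cases}
1 - e^{\lambda x} \left( \dfrac{\theta}{\theta+\lambda} \right)^{\kappa} & \text{if } x \le 0 \\[8pt]
\dfrac{1}{\Gamma(\kappa)} \left[ I(\theta x,\kappa) - \left( \dfrac{\theta}{\theta+\lambda} \right)^{\kappa} e^{\lambda x} I([\theta+\lambda]x,\kappa) \right] & \text{if } x > 0
\end{cases}
$$

and

$$
f_X(x) =
\begin{cases}
\lambda e^{\lambda x} \left( \dfrac{\theta}{\theta+\lambda} \right)^{\kappa} & \text{if } x \le 0 \\[8pt]
\lambda e^{\lambda x} \left( \dfrac{\theta}{\theta+\lambda} \right)^{\kappa} \dfrac{I([\theta+\lambda]x,\kappa)}{\Gamma(\kappa)} & \text{if } x > 0
\end{cases}
$$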
Fig. 1 (third row) illustrates some shapes that the hazard rate function of the power-law adjusted log-gamma distribution can assume.
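
As a worked check of the $x \le 0$ branch of the log-gamma result: since $Z_0$ is supported on $(0,\infty)$, $S_0(x) = 1$ for $x \le 0$ and the integral in (1) runs over the whole support of $f_0$, so it reduces to the Laplace transform of the gamma density. The lines below sketch this step; they are a reconstruction, not text from the original paper.

```latex
\begin{align*}
\int_0^\infty e^{-\lambda z} f_0(z)\,dz
  &= \frac{\theta^{\kappa}}{\Gamma(\kappa)} \int_0^\infty z^{\kappa-1} e^{-(\theta+\lambda) z}\,dz
   = \left( \frac{\theta}{\theta+\lambda} \right)^{\kappa}, \\[4pt]
S_X(x) &= S_0(x) - e^{\lambda x} \int_0^\infty e^{-\lambda z} f_0(z)\,dz
   = 1 - e^{\lambda x} \left( \frac{\theta}{\theta+\lambda} \right)^{\kappa}
   \qquad (x \le 0).
\end{align*}
```

In particular $P(T > 1) = S_X(0) = 1 - (\theta/(\theta+\lambda))^{\kappa}$, so the adjustment moves probability mass $(\theta/(\theta+\lambda))^{\kappa}$ below the lower end point of the unadjusted support.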
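Pareto model. If $T_0$ follows a Pareto distribution with support on $(\tau_0,\infty)$ and pdf $f_{T_0}(t) = \frac{\alpha}{\tau_0} \left( \frac{t}{\tau_0} \right)^{-(\alpha+1)}$ thereon, one can show that the power-law adjusted form has support on $(0,\infty)$ and (using (4)) that the survivor function of the power-law adjusted form is

$$
S_T(t) =
\begin{cases}
1 - \dfrac{\alpha}{\alpha+\lambda} \left( \dfrac{t}{\tau_0} \right)^{\lambda} & \text{if } t \le \tau_0 \\[8pt]
\dfrac{\lambda}{\alpha+\lambda} \left( \dfrac{t}{\tau_0} \right)^{-\alpha} & \text{if } t > \tau_0
\end{cases}
$$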
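and using (5) that the corresponding pdf is

$$
f_T(t) =
\begin{cases}
\dfrac{\alpha\lambda}{\alpha+\lambda}\, \dfrac{1}{\tau_0} \left( \dfrac{t}{\tau_0} \right)^{\lambda-1} & \text{if } t \le \tau_0 \\[8pt]
\dfrac{\alpha\lambda}{\alpha+\lambda}\, \dfrac{1}{\tau_0} \left( \dfrac{t}{\tau_0} \right)^{-\alpha-1} & \text{if } t > \tau_0
\end{cases}
$$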
Fig. 1 (bottom row) illustrates some shapes that the hazard rate function of the power-law adjusted Pareto distribution can assume.
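
Because the Pareto case is in closed form, it is a convenient one for checking the general formula (4) numerically. The Python sketch below is illustrative only, with hypothetical function names. It codes the closed-form survivor function and density of the power-law adjusted Pareto distribution and compares the survivor function with a midpoint-rule evaluation of (4) after the substitution $u = t\,r^{-1/\lambda}$, which gives $S_T(t) = \int_0^1 S_{T_0}(t r^{-1/\lambda})\,dr$.

```python
def pareto_survivor(u, alpha, tau0):
    """Survivor function of the unadjusted Pareto distribution on (tau0, inf)."""
    return 1.0 if u <= tau0 else (u / tau0) ** (-alpha)


def adjusted_pareto_survivor(t, alpha, tau0, lam):
    """Closed-form S_T(t) of the power-law adjusted Pareto distribution."""
    if t <= tau0:
        return 1.0 - alpha / (alpha + lam) * (t / tau0) ** lam
    return lam / (alpha + lam) * (t / tau0) ** (-alpha)


def adjusted_pareto_density(t, alpha, tau0, lam):
    """Closed-form f_T(t) of the power-law adjusted Pareto distribution."""
    const = alpha * lam / ((alpha + lam) * tau0)
    if t <= tau0:
        return const * (t / tau0) ** (lam - 1.0)
    return const * (t / tau0) ** (-alpha - 1.0)


def adjusted_pareto_hazard(t, alpha, tau0, lam):
    """Hazard rate h_T(t) = f_T(t) / S_T(t)."""
    return (adjusted_pareto_density(t, alpha, tau0, lam)
            / adjusted_pareto_survivor(t, alpha, tau0, lam))


def survivor_by_quadrature(t, alpha, tau0, lam, n=50_000):
    """Midpoint-rule value of (4) after substituting u = t * r**(-1/lam)."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * h
        total += pareto_survivor(t * r ** (-1.0 / lam), alpha, tau0)
    return total * h


if __name__ == "__main__":
    # Parameter values of the centre panel in the bottom row of Fig. 1.
    alpha, tau0, lam = 15.0, 1.5, 2.0
    for t in (0.5, 1.0, 1.5, 2.0):
        print(t,
              adjusted_pareto_survivor(t, alpha, tau0, lam),
              survivor_by_quadrature(t, alpha, tau0, lam),
              adjusted_pareto_hazard(t, alpha, tau0, lam))
```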
Lognormal model. Consider the case where $Z_0 = \log T_0$ follows a normal distribution with mean $\mu$ and variance $\sigma^2$. The power-law adjusted version of this distribution (the lognormal-power function or lNpf distribution) was considered in [5], where it is shown that the survivor function and density of $X = \log T$, where $T$ follows the lNpf distribution, are

$$S_X(x) = \phi\!\left( \frac{x-\mu}{\sigma} \right) \left[ R\!\left( \frac{x-\mu}{\sigma} \right) - R\!\left( \lambda\sigma + \frac{x-\mu}{\sigma} \right) \right]$$

and

$$f_X(x) = \lambda\, \phi\!\left( \frac{x-\mu}{\sigma} \right) R\!\left( \lambda\sigma + \frac{x-\mu}{\sigma} \right)$$

where $R$ is Mills' ratio of the complementary cumulative distribution function (cdf) to the pdf of a standard normal distribution:

$$R(z) = \frac{\Phi^{c}(z)}{\phi(z)}.$$
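
The lognormal-power function formulas involve only the standard normal pdf and tail probability, so they can be coded with the Python standard library. The sketch below is a hedged illustration with hypothetical function names; it computes Mills' ratio directly as a quotient, which is adequate for moderate arguments but would need a more careful implementation far in the upper tail, where both numerator and denominator underflow.

```python
import math


def std_normal_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_upper_tail(z):
    """Complementary cdf 1 - Phi(z) of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def mills_ratio(z):
    """R(z) = (1 - Phi(z)) / phi(z)."""
    return std_normal_upper_tail(z) / std_normal_pdf(z)


def lnpf_survivor_density(x, mu, sigma, lam):
    """Return (S_X(x), f_X(x)) for X = log T, with T lognormal-power function."""
    y = (x - mu) / sigma
    phi = std_normal_pdf(y)
    shifted = mills_ratio(lam * sigma + y)
    s_x = phi * (mills_ratio(y) - shifted)
    f_x = lam * phi * shifted
    return s_x, f_x


def lnpf_hazard(t, mu, sigma, lam):
    """Hazard rate of T: h_T(t) = (f_X(log t) / t) / S_X(log t)."""
    s_x, f_x = lnpf_survivor_density(math.log(t), mu, sigma, lam)
    return (f_x / t) / s_x


if __name__ == "__main__":
    # Illustrative parameter values, not estimates from the paper.
    for t in (0.2, 0.5, 1.0, 2.0, 4.0):
        print(t, lnpf_hazard(t, mu=0.0, sigma=1.0, lam=1.5))
```

Since $X \overset{d}{=} Z_0 - E/\lambda$ and $e^{-E}$ is uniform on $(0,1)$, the survivor function can be checked against $\int_0^1 \Phi^{c}\bigl((x - \log(r)/\lambda - \mu)/\sigma\bigr)\,dr$.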
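Generalized gamma model. The three-parameter generalized gamma distribution includes the Weibull, gamma and lognormal models as special or limiting cases. It has density

$$f_{T_0}(t) = \alpha \theta^{\kappa} t^{\alpha\kappa-1} \exp(-\theta t^{\alpha}) / \Gamma(\kappa).$$

With some work using (2) and (3), the survivor function and density of $X = \log T$, where $T$ follows the power-law adjusted generalized gamma distribution, can be shown to be

$$S_X(x) = \frac{1}{\Gamma(\kappa)} \left[ I(\theta e^{\alpha x},\kappa) - \theta^{\lambda/\alpha} e^{\lambda x} I(\theta e^{\alpha x},\kappa-\lambda/\alpha) \right]$$

$$f_X(x) = \frac{\lambda \theta^{\lambda/\alpha}}{\Gamma(\kappa)}\, e^{\lambda x}\, I(\theta e^{\alpha x},\kappa-\lambda/\alpha)$$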
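
As a consistency check (a reconstruction, not part of the original text), setting $\kappa = 1$ in the generalized gamma expressions should recover the power-law adjusted Weibull results, with the generalized gamma parameters $(\alpha, \theta)$ playing the roles of the Weibull parameters $(\beta, \alpha)$. Write $y = \theta e^{\alpha x}$ and $c = \lambda/\alpha$, and use $\Gamma(1) = 1$, $I(y,1) = e^{-y}$ and the integration-by-parts identity $I(y, 1-c) = -c\,I(y,-c) + y^{-c} e^{-y}$:

```latex
\begin{align*}
S_X(x) &= I(y, 1) - \theta^{c} e^{\lambda x}\, I(y, 1-c) \\
  &= e^{-y} - \theta^{c} e^{\lambda x} \left[ -c\, I(y,-c) + y^{-c} e^{-y} \right] \\
  &= e^{-y} + c\, \theta^{c} e^{\lambda x}\, I(y,-c) - e^{-y}
     \qquad \text{since } \theta^{c} e^{\lambda x} y^{-c} = 1 \\
  &= \frac{\lambda\, \theta^{\lambda/\alpha}}{\alpha}\, e^{\lambda x}\,
     I(\theta e^{\alpha x}, -\lambda/\alpha).
\end{align*}
```

This is the Weibull survivor function given earlier with $\beta$ replaced by $\alpha$ and $\alpha$ replaced by $\theta$; the density agrees term by term without further manipulation.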
It should be noted that while the (unadjusted) log-gamma and Pareto distributions have support bounded away from zero, their power-law adjusted versions have support on $[0,\infty)$, as indeed occurs in all of the power-law adjusted models discussed in this paper. Thus in these models there are no problems with the range of support depending on a parameter, as occurs for example with the generalized Weibull distribution.
III. PARAMETER ESTIMATION BY MAXIMUM LIKELIHOOD
The parametric likelihood for much failure-time data is proportional to

$$\prod_{i=1}^{n} [f_{T_i}(t_i)]^{\delta_i}\, [S_{T_i}(t_i)]^{1-\delta_i}$$

where $\delta_i$ is an indicator variable with value 1 for an observed failure time, and value 0 for a right-censored observation. If there are no covariates and the failure times are considered to be identically distributed following a power-law adjusted distribution with pdf and survivor function $f_T$ and $S_T$, then up to an additive constant the log-likelihood is

$$\sum_{i=1}^{n} \delta_i \log f_T(t_i) + \sum_{i=1}^{n} (1-\delta_i) \log S_T(t_i)$$

which, again up to an additive constant not involving the parameters, is the same as

$$\sum_{i=1}^{n} \delta_i \log f_X(\log t_i) + \sum_{i=1}^{n} (1-\delta_i) \log S_X(\log t_i) - \sum_{i=1}^{n} \log t_i$$

Fig. 2. Kernel smoothed non-parametric estimate of the hazard rate function for electrical appliances data (smoothed estimated hazard rate against time in number of cycles). The Epanechnikov kernel with a bandwidth of 1500 was used. Note that the right-hand part (> 6000) of the estimated hazard is unreliable, being based on only two observations.

Thus for each of the models discussed above an analytical expression for the log-likelihood can be obtained. This will need to be maximized numerically to obtain maximum likelihood estimates, using an optimization routine such as optim in R. For starting values one can use the MLEs of the two parameters of the unadjusted distribution and an arbitrary value (say 1) for $\lambda$.
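
To make the objective function concrete, the following Python sketch evaluates the right-censored log-likelihood from $\log f_X$ and $\log S_X$, using the power-law adjusted Pareto model because its formulas are in closed form. It is an illustration under stated assumptions: the function names are hypothetical and the optimization step itself is not shown. In R the objective would be handed to optim as described above; in Python a general-purpose optimizer such as `scipy.optimize.minimize` with `method="Nelder-Mead"` could play the same role.

```python
import math


def censored_loglik(times, deltas, log_f_x, log_s_x):
    """Right-censored log-likelihood written in terms of X = log T.

    times   : observed failure or censoring times t_i (all > 0)
    deltas  : 1 for an observed failure, 0 for a right-censored time
    log_f_x : callable returning log f_X(x)
    log_s_x : callable returning log S_X(x)
    """
    total = 0.0
    for t, d in zip(times, deltas):
        x = math.log(t)
        if d == 1:
            total += log_f_x(x) - x     # log f_T(t) = log f_X(log t) - log t
        else:
            total += log_s_x(x)         # log S_T(t) = log S_X(log t)
    return total


def make_adjusted_pareto(alpha, tau0, lam):
    """Return (log_f_x, log_s_x) for the power-law adjusted Pareto model."""
    log_tau0 = math.log(tau0)
    log_c = math.log(alpha * lam / (alpha + lam))

    def log_f_x(x):
        d = x - log_tau0                # log(t / tau0)
        return log_c + (lam * d if d <= 0.0 else -alpha * d)

    def log_s_x(x):
        d = x - log_tau0
        if d <= 0.0:
            return math.log(1.0 - alpha / (alpha + lam) * math.exp(lam * d))
        return math.log(lam / (alpha + lam)) - alpha * d

    return log_f_x, log_s_x


def negative_loglik(params, times, deltas):
    """Objective for a minimizer; params = (alpha, tau0, lam), all required positive."""
    alpha, tau0, lam = params
    if min(alpha, tau0, lam) <= 0.0:
        return math.inf                 # keep the search inside the parameter space
    log_f_x, log_s_x = make_adjusted_pareto(alpha, tau0, lam)
    return -censored_loglik(times, deltas, log_f_x, log_s_x)


if __name__ == "__main__":
    # Made-up data for illustration only; the last time is right-censored.
    times = [0.4, 1.2, 1.9, 3.0]
    deltas = [1, 1, 1, 0]
    print(negative_loglik((2.0, 1.5, 2.0), times, deltas))
```

Only the uncensored observations carry the $-\log t_i$ term here; the corresponding term for censored observations does not involve the parameters and so does not affect the maximizer.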
Covariates $Z^T = (Z_1, Z_2, \ldots, Z_p)$ can be incorporated in an accelerated failure time (AFT) regression model:

$$\log T = \beta_0 + \beta^T Z + X \tag{7}$$

where $X$ is a random variable with one of the power-law adjusted distributions of the previous section. Note that for all but the log-gamma these distributions can be re-parameterized in terms of a location parameter and two other parameters. In these cases the intercept term $\beta_0$ in (7) is not needed (and indeed will result in a non-identifiable model if it is included).
IV. AN EXAMPLE
Electrical appliances. Lawless (p. 256) [2] presents data on the numbers of cycles to failure for 60 electrical appliances put on test. All of the sixty appliances eventually failed, the largest failure times being 6065 and 9701 cycles. Fig. 2 shows a kernel-smoothed non-parametric estimate of the hazard rate for these data. There is clearly a suggestion of multi-modality. To assess and compare the various power-law adjusted models discussed in the previous section, each was fitted to these data. Maximization of the log-likelihood was performed in R using the Nelder-Mead method in the routine optim and in all cases required only a minute or two of computation.

Fig. 3. Maximum likelihood estimates of various power-law adjusted distributions for the electrical appliance data (four panels, each plotting fitted hazard rate against time in cycles). They are (clockwise from upper left) Weibull, log gamma, lognormal and Pareto.

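
A kernel-smoothed hazard estimate of the kind shown in Fig. 2 can be sketched as follows. The paper states only the kernel and the bandwidth, so this sketch assumes the common construction that smooths the Nelson-Aalen increments $\delta_{(i)}/(n-i+1)$ with the kernel; whether that is exactly the estimator used for Fig. 2 is not stated. The function names are hypothetical, ties are not treated specially, no boundary correction is applied, and the data are made-up values, not the appliance data from [2].

```python
def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u**2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def nelson_aalen_increments(times, deltas):
    """Return the sorted times and the jumps delta_i / (number still at risk)."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    n = len(times)
    sorted_times, jumps = [], []
    for rank, i in enumerate(order):
        at_risk = n - rank              # subjects with time >= the current one
        sorted_times.append(times[i])
        jumps.append(deltas[i] / at_risk)
    return sorted_times, jumps


def smoothed_hazard(t, times, deltas, bandwidth):
    """Kernel-smoothed hazard estimate at time t."""
    sorted_times, jumps = nelson_aalen_increments(times, deltas)
    total = 0.0
    for s, jump in zip(sorted_times, jumps):
        total += epanechnikov((t - s) / bandwidth) * jump
    return total / bandwidth


if __name__ == "__main__":
    # Hypothetical failure times in cycles (all observed, none censored).
    times = [120.0, 450.0, 800.0, 1500.0, 2300.0, 3100.0, 4000.0, 6000.0]
    deltas = [1] * len(times)
    for t in range(0, 8001, 1000):
        print(t, smoothed_hazard(float(t), times, deltas, bandwidth=1500.0))
```

As the caption of Fig. 2 notes, such an estimate is unreliable where few observations remain at risk, because the last increments are large and rest on very little data.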
The values of the maximized log-likelihood and of the Akaike Information Criterion (AIC) for the power-law adjusted forms of the two-parameter models are given in Table 1. In all cases, the improvement in fit obtained by including the power-law adjustment was highly significant (P
-
in evaluating the incomplete gamma functions which occur in the distributions discussed in this paper and so the extra computation involved might not be too great.
REFERENCES
[1] Cox, C., Chu, H., Schneider, M. & Muñoz, A. "Parametric survival analysis and taxonomy of hazard functions for the generalized gamma distribution," Statist. Med. 26, pp. 4352-4374, 2007.
[2] Lawless, J. F. Statistical Models and Methods for Lifetime Data. New York: John Wiley and Sons, 1982.
[3] Mudholkar, G. S. & Srivastava, D. K. "Exponentiated Weibull family for analyzing bathtub failure-rate data," IEEE Trans. Rel. 42, pp. 299-302, 1993.
[4] Mudholkar, G. S., Srivastava, D. K. & Kollia, G. D. "A generalization of the Weibull distribution with application to the analysis of survival data," J. Amer. Stat. Assoc. 91, pp. 1575-1583, 1996.
[5] Reed, W. J. "A flexible parametric survival model which allows a bathtub shaped hazard rate function," J. Appl. Stat. 38, pp. 1665-1680, 2011.
[6] Reed, W. J. & Jorgensen, M. "The double Pareto-lognormal distribution - A new parametric model for size distributions," Comm. Stats - Theory & Methods 33, pp. 1733-1753, 2004.