On the discrete analogues of continuous distributions

(This is a sample cover image for this issue. The actual cover is not yet available at this time.) This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution and sharing with colleagues. Other uses, including reproduction and distribution, or selling or licensing copies, or posting to personal, institutional or third party websites are prohibited. In most cases authors are permitted to post their version of the article (e.g. in Word or Tex form) to their personal website or institutional repository. Authors requiring further information regarding Elsevier’s archiving and manuscript policies are encouraged to visit: http://www.elsevier.com/copyright

Author's personal copy

Statistical Methodology 9 (2012) 589–603

Contents lists available at SciVerse ScienceDirect

Statistical Methodology

journal homepage: www.elsevier.com/locate/stamet

Ayman Alzaatreh a, Carl Lee b, Felix Famoye b,∗

a Department of Mathematics & Statistics, Austin Peay State University, Clarksville, TN 37044, United States
b Department of Mathematics, Central Michigan University, Mount Pleasant, MI 48859, United States

Article info

Article history:
Received 29 July 2011
Received in revised form 27 March 2012
Accepted 28 March 2012

Keywords:
T-geometric family
Exponentiated-exponential–geometric distribution
Simulation study
Applications

Abstract

In this paper, a new method is proposed for generating discrete distributions. A special class of the distributions, namely, the T-geometric family, contains the discrete analogues of continuous distributions. Some general properties of the T-geometric family of distributions are obtained. A member of the T-geometric family, namely, the exponentiated-exponential–geometric distribution, is defined and studied. Various properties of the exponentiated-exponential–geometric distribution, such as the unimodality, the moments and the probability generating function, are discussed. The method of maximum likelihood estimation is proposed for estimating the model parameters. Three real data sets are used to illustrate the applications of the exponentiated-exponential–geometric distribution.

© 2012 Elsevier B.V. All rights reserved.

1. Introduction

Discrete distributions are very important for modeling real life scenarios. Many research papers have been published on the study and applications of discrete distributions. A large number of discrete distributions can be found in [10,3,5]. Various techniques for generating families of discrete distributions have been developed.

Katz [11] developed a discrete analogue of the Pearson continuous system by using the relationship

$$\frac{p_{x+1}}{p_x} = \frac{a + bx}{1 + x}, \quad x = 0, 1, 2, \ldots.$$

A more general extension of the Katz family is the Kemp families of distributions. The family of generalized hypergeometric probability distributions by Kemp [12] generated a wide variety of existing discrete distributions. For detailed discussion, one may refer to Johnson et al. [10, Chapter 2].

∗ Corresponding author. Tel.: +1 989 774 5497; fax: +1 989 774 2414.
E-mail address: [email protected] (F. Famoye).

1572-3127/$ – see front matter © 2012 Elsevier B.V. All rights reserved.
doi:10.1016/j.stamet.2012.03.003

The Lagrangian family of discrete distributions, generated by using the Lagrangian expansion, is another important technique for generating discrete distributions, which was studied by Consul, Shenton and their collaborators beginning in the early 1970s. For detailed discussion, one may refer to the books by Consul [4] and Consul and Famoye [5]. More recently, Li et al. [20] relaxed some conditions on the Lagrangian expansion and used it to generate the family of generalized Lagrangian probability distributions.

Some discrete analogues of continuous distributions were developed by using the form

$$P(X = k) = \frac{f(k)}{\sum_{u=-\infty}^{\infty} f(u)}, \quad k = 0, \pm 1, \pm 2, \ldots,$$

where f is the probability density function of a continuous random variable. Examples include the discrete normal distribution studied by Kemp [13] and the discrete Laplace distribution studied by Inusah and Kozubowski [9].

Roknabadi et al. [22] defined the telescopic family of distributions as the one with probability function $P(X = x) = q^{k_\theta(x)} - q^{k_\theta(x+1)}$, x = 0, 1, 2, . . . , where 0 < q < 1 and $k_\theta(x)$ is a strictly increasing function of x with $k_\theta(0) = 0$ and $k_\theta(x) \to \infty$ as x → ∞. They showed that if a continuous random variable belongs to the extended exponential family (e.g. exponential, Rayleigh and Weibull) with cumulative distribution function $G(t) = 1 - \exp[-\alpha k_\theta(t)]$, then the discrete versions of these continuous distributions are members of the telescopic family of distributions.

In his study of discrete reliability measures, Roy [23] pointed out that the geometric distribution is a discrete analogue of the exponential distribution. Subsequently, Roy [24,25] proposed a method from a reliability perspective to discretize continuous distributions and studied discrete analogues of the Rayleigh and normal distributions. Denote the survival function of a continuous random variable X by S(x) = P(X ≥ x). If times are grouped into unit intervals, the discrete observed variable dX = [X], the largest integer less than or equal to X, has the probability function

P(dX = x) = p(x) = P(x ≤ X < x + 1) = S(x) − S(x + 1), x = 0, 1, 2, . . . . (1.1)

Krishna and Pundir [18] applied this method to study the discrete Burr and discrete Pareto distributions. The result in (1.1) belongs to the telescopic family of Roknabadi et al. [22].

In this paper, we propose another technique to generate new families of discrete distributions. A new discrete distribution generated using this technique is studied in detail. In Section 2, we define and study some properties of the family of discrete analogues of the distribution of any non-negative continuous random variable, namely, the T-geometric distribution. In Section 3, a member of the T-geometric family, the exponentiated-exponential–geometric distribution (EEGD), is defined and studied. In Section 4, the method of maximum likelihood estimation (MLE) is proposed to estimate the EEGD parameters. In Section 5, the likelihood ratio test, the Wald test and the score test are proposed to compare the geometric distribution with the EEGD. A simulation study is conducted to evaluate the performance of the three tests. Applications of the EEGD to real data sets are provided in Section 6.

2. Discrete analogues of distributions of non-negative continuous random variables

Let F(x) be the cumulative distribution function (CDF) of any random variable X and let r(t) be the probability density function of a continuous random variable T defined on [0, ∞). The CDF of the T-X family of distributions defined by Alzaatreh et al. [1] is given as

$$G(x) = \int_0^{-\log(1 - F(x))} r(t)\,dt = R\{-\log(1 - F(x))\}, \tag{2.1}$$

where R(t) is the CDF of the random variable T. Alzaatreh et al. [1] call the family of distributions defined in (2.1) the ‘Transformed-Transformer’ family (or T-X family). When X is a continuous random variable, the probability density function of the T-X family is

$$g(x) = \frac{f(x)}{1 - F(x)}\, r\left(-\log(1 - F(x))\right) = h(x)\, r(H(x)).$$

Note that h(x) is the hazard function for the random variable X with CDF F(x) and the function −log(1 − F(x)) is the corresponding cumulative hazard function, H(x). This family of distributions can also be viewed as a new family of distributions generated by the composition of T and H(X). If X is a discrete random variable, the T-X family is a family of discrete distributions transformed from the non-negative continuous random variable T. By using (2.1), the probability mass function (pmf) of the T-X family of discrete distributions can be written as

g(x) = G(x) − G(x − 1) = R {− log (1 − F(x))} − R {− log (1 − F(x − 1))} . (2.2)

Here, we provide the connection between the discrete T-X family and the family of Lagrangian probability distributions. The pmf of the class of general Lagrangian probability distributions of the first kind [5] is defined by

$$P(X = x) = \begin{cases} f_1(0), & x = 0, \\ \dfrac{1}{x!}\left[\dfrac{d^{x-1}}{dz^{x-1}}\left\{(f_2(z))^x\, \dfrac{d}{dz} f_1(z)\right\}\right]_{z=0}, & x = 1, 2, \ldots, \end{cases} \tag{2.3}$$

where $f_1(z)$ and $f_2(z)$ are two analytic functions of z with $f_1(1) = f_2(1) = 1$ and $f_2(0) \neq 0$. The T-X family can be generated from the general Lagrangian distribution in (2.3) by taking $f_2(z) = 1$ and $f_1(z) = \sum_{x=0}^{\infty} z^x g(x)$, where g(x) is the pmf of the T-X family in (2.2).

In the following, we define the T-geometric family and show that this family of distributions corresponds to the discrete analogues of continuous distributions. If a random variable X follows a geometric distribution with parameter p, the CDF of X is given by

$$F(x) = 1 - p^{x+1}, \quad x = 0, 1, 2, \ldots. \tag{2.4}$$

In (2.4), X denotes the number of failures before the first success and p is the probability of a failure. By using (2.4), the pmf of the T-geometric family in (2.2) is

g(x) = R (c(x + 1)) − R(cx), x = 0, 1, 2, . . . , where c = − log p > 0. (2.5)

When c = 1, the result in (1.1) is a special case of (2.5) by writing S(x) as 1 − R(x).

As illustrated in (2.2), to generate a discrete T-X distribution, one needs to obtain the discrete cumulative hazard function of X. The probability at each x-value of the T-X distribution is obtained by evaluating the CDF of the random variable T at the cumulative hazard H(x). When X has a geometric distribution, (2.5) shows that the cumulative hazard function can be simplified and the T-geometric family is the discrete analogue of the distribution of any non-negative continuous random variable.

Theorem 1. If T is a non-negative continuous random variable with CDF R(t) and X is a geometric random variable, then the pmf of the T-geometric family is

g(x) = R (c(x + 1)) − R(cx).

Proof. On using (2.4), g(x) in (2.2) reduces to

g(x) = R (−(x + 1) log(p)) − R (−x log(p)) = R (c(x + 1)) − R(cx), x = 0, 1, 2, . . . ,

where c = − log p > 0. □
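Theorem 1 can be illustrated with a short numerical sketch. The code below is not from the paper: the helper name `t_geometric_pmf`, the value p = 0.3 and the choice of a rate-1 exponential CDF for T are assumptions made only for this example. With an exponential T, the pmf in (2.5) should collapse to the geometric pmf.

```python
import math


def t_geometric_pmf(R, p, x):
    """Evaluate g(x) = R(c(x + 1)) - R(cx) with c = -log p, as in (2.5).

    R is the CDF of a non-negative continuous random variable T,
    p in (0, 1) is the geometric failure probability, x = 0, 1, 2, ...
    """
    c = -math.log(p)
    return R(c * (x + 1)) - R(c * x)


def exponential_cdf(t):
    """CDF of a rate-1 exponential variable (illustrative choice of T)."""
    return 1.0 - math.exp(-t)


p = 0.3  # illustrative value
t_geo = [t_geometric_pmf(exponential_cdf, p, x) for x in range(6)]
# With exponential T the pmf should match the geometric pmf (1 - p) * p^x.
geometric = [(1.0 - p) * p ** x for x in range(6)]
```

Any other CDF of a non-negative continuous variable can be passed as `R` in the same way.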

Theorem 1 is an important result that defines the discrete distribution analogue to the distribution of any non-negative continuous random variable T. The following are some examples of applying Theorem 1 to generate the discrete distribution analogue to a non-negative continuous random variable T.

Example 1. If T is a Pareto random variable on [0, ∞) with CDF $R(t) = 1 - \theta^k/(t + \theta)^k$, the pmf of the discrete Pareto distribution is

$$g(x) = \frac{\lambda^k}{(x + \lambda)^k} - \frac{\lambda^k}{(x + \lambda + 1)^k}, \quad x = 0, 1, 2, \ldots, \tag{2.6}$$

where λ = θ/c > 0 and k > 0.

When λ = 1, (2.6) reduces to the discrete Pareto distribution defined by Krishna and Pundir [18].

Example 2. If T is the Weibull random variable with CDF $R(t) = 1 - e^{-(t/\gamma)^{\alpha}}$, the pmf of the discrete Weibull distribution is

$$g(x) = e^{-(x/\theta)^{\alpha}} - e^{-((x+1)/\theta)^{\alpha}}, \quad x = 0, 1, 2, \ldots, \tag{2.7}$$

where θ = γ/c > 0 and α > 0. On replacing $e^{-\theta^{-\alpha}}$ by q, (2.7) can be written as $g(x) = q^{x^{\alpha}} - q^{(x+1)^{\alpha}}$, x = 0, 1, 2, . . . , which is the discrete Weibull distribution defined by Nakagawa and Osaki [21]. Nakagawa and Osaki [21] showed that the discrete Weibull distribution can be used in reliability models. Some properties of the discrete Weibull distribution have been studied by Khan et al. [15], Stein and Dattero [27], and Kulasekera [19]. When α = 2, the discrete Weibull distribution reduces to the discrete Rayleigh distribution defined by Roy [25].

Example 3. If T is the Burr (Type XII) random variable with CDF $R(t) = 1 - (1 + t^{\alpha})^{-\beta}$, where t > 0, the pmf of the discrete Burr distribution is

$$g(x) = (1 + (cx)^{\alpha})^{-\beta} - (1 + (c(x + 1))^{\alpha})^{-\beta}, \quad x = 0, 1, 2, \ldots, \tag{2.8}$$

where α, β, c > 0. Another discrete Burr distribution was obtained by Krishna and Pundir [18]. Their distribution is a special case of (2.8) obtained by taking c = 1. When α = c = 1, (2.8) reduces to the discrete Pareto in (2.6) with parameters λ = 1 and k = β.
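The three examples can be checked numerically. The following sketch is an illustration only: the function names and the parameter values are assumptions, not part of the paper. It verifies that the discrete Burr pmf with α = c = 1 agrees with the discrete Pareto pmf with λ = 1 and k = β, and that the discrete Weibull probabilities sum to (nearly) one.

```python
import math


def discrete_pareto_pmf(x, lam, k):
    """pmf (2.6): lam^k / (x + lam)^k - lam^k / (x + lam + 1)^k."""
    return (lam / (x + lam)) ** k - (lam / (x + lam + 1)) ** k


def discrete_weibull_pmf(x, theta, alpha):
    """pmf (2.7): exp(-(x/theta)^alpha) - exp(-((x + 1)/theta)^alpha)."""
    return math.exp(-((x / theta) ** alpha)) - math.exp(-(((x + 1) / theta) ** alpha))


def discrete_burr_pmf(x, alpha, beta, c):
    """pmf (2.8): (1 + (cx)^alpha)^(-beta) - (1 + (c(x + 1))^alpha)^(-beta)."""
    return (1 + (c * x) ** alpha) ** (-beta) - (1 + (c * (x + 1)) ** alpha) ** (-beta)


beta = 2.5  # illustrative value
# alpha = c = 1 in (2.8) should give the discrete Pareto with lam = 1, k = beta.
burr_vs_pareto = max(
    abs(discrete_burr_pmf(x, 1.0, beta, 1.0) - discrete_pareto_pmf(x, 1.0, beta))
    for x in range(50)
)
# The partial sums of (2.7) telescope to 1 - exp(-((N + 1)/theta)^alpha).
weibull_total = sum(discrete_weibull_pmf(x, 2.0, 1.5) for x in range(100))
```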

In (2.5), c is a scale parameter which is 1 when $p = e^{-1}$. The pmf in (2.5) is the difference between two successive values of the CDF of the random variable T. The arguments of these successive CDF values are scaled down when c < 1 or scaled up when c > 1. When c = 1, there is no scaling and g(0) = R(1) − R(0) = R(1), g(1) = R(2) − R(1), . . . , g(k) = R(k + 1) − R(k), etc. Lemma 1 shows the relation between the non-negative continuous random variable T and the non-negative discrete random variable X for the T-geometric distribution in (2.5).

Lemma 1. If a non-negative continuous random variable T has the CDF R(t), then the non-negative discrete random variable X = [T/c] has the discrete T-geometric family of distributions, where [u] is the largest integer less than or equal to u.

Proof. We have,

P(X = x) = P([T/c] = x) = P(cx ≤ T < c(x + 1)) = R(c(x + 1)) − R(cx), x = 0, 1, 2, . . . .

Hence, P(X = x) = g(x) in (2.5). □
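Lemma 1 suggests a simple way to simulate from a T-geometric distribution: draw T and take the integer part of T/c. The sketch below is a hedged illustration that assumes a Weibull T sampled by inverting its CDF; the seed, sample size and parameter values are arbitrary choices for the example. It compares the empirical frequencies of X = [T/c] with the pmf in (2.5).

```python
import math
import random


def weibull_cdf(t, gamma, alpha):
    """CDF R(t) = 1 - exp(-(t/gamma)^alpha) of the Weibull variable T."""
    return 1.0 - math.exp(-((t / gamma) ** alpha))


def sample_floor_T_over_c(n, gamma, alpha, c, rng):
    """Draw n values of X = [T / c], with T Weibull via inverse-CDF sampling."""
    draws = []
    for _ in range(n):
        u = rng.random()                                    # u in [0, 1)
        t = gamma * (-math.log(1.0 - u)) ** (1.0 / alpha)   # solves R(t) = u
        draws.append(math.floor(t / c))
    return draws


rng = random.Random(2012)          # fixed seed so the run is reproducible
gamma, alpha, p = 3.0, 1.5, 0.5    # illustrative values only
c = -math.log(p)
n = 100_000
sample = sample_floor_T_over_c(n, gamma, alpha, c, rng)

# Compare empirical frequencies with g(x) = R(c(x + 1)) - R(cx) from (2.5).
max_gap = 0.0
for x in range(10):
    empirical = sample.count(x) / n
    exact = weibull_cdf(c * (x + 1), gamma, alpha) - weibull_cdf(c * x, gamma, alpha)
    max_gap = max(max_gap, abs(empirical - exact))
```

With a sample of this size, `max_gap` should be small, on the order of the sampling error.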

Lemma 2 and Theorem 2 show the relation between the shape of the distribution of the random variable T and the shape of the distribution of the random variable X for the T-geometric family of distributions.

Lemma 2. If the distribution of a non-negative continuous random variable T has a reversed J-shape, then the T-geometric distribution has a reversed J-shape.

Proof. We need to show that g(x) > g(y) for any two non-negative integers x < y, where g(x) = R(c(x + 1)) − R(cx). Since c is a constant, without loss of generality we assume c = 1, and hence g(x) − g(y) = R(x + 1) − R(x) − [R(y + 1) − R(y)]. By the Mean Value Theorem, there exist two real numbers ξ1 and ξ2 such that g(x) − g(y) = r(ξ1) − r(ξ2), where ξ1 ∈ (x, x + 1) and ξ2 ∈ (y, y + 1). Since x < x + 1 ≤ y < y + 1, we have ξ1 < ξ2 and hence r(ξ1) > r(ξ2), because r(t) is decreasing. □

Notice that when T does not have a reversed J-shape, the shape of g(x) may still be a reversed J-shape. An example is the graph of the EEGD when θ = 0.25 and α = 2 in Fig. 1. While the graph of the EEGD is a reversed J-shape when α = 2, the graph of the exponentiated-exponential distribution is not a reversed J-shape.

Theorem 2. If the distribution of the non-negative continuous random variable T is unimodal with a unique mode m, then the T-geometric family of distributions is unimodal.

Fig. 1. The Graphs of the EEG pmf for various values of θ (theta) and α (alpha).

Proof. Without loss of generality, we assume c = 1. If r(t) has a reversed J-shape, then Lemma 2 shows that g(x) has a reversed J-shape and is hence unimodal. Now suppose r(t) does not have a reversed J-shape. Since r(t) has a unique mode at m, r(t) is strictly increasing for all t < m and strictly decreasing for all t > m. Now let x be a non-negative integer. If x ≥ [m] + 1, then Lemma 2 shows that g(x) ≤ g([m] + 1). If x ≤ [m] − 1, then by using the Mean Value Theorem it can be shown that g(x) ≤ g([m] − 1). To complete the argument, we state and prove the following claim: g([m]) is greater than or equal to g([m] − 1) or g([m] + 1).

Proof of the claim: Suppose that g([m]) < g([m] − 1) and g([m]) < g([m] + 1). Then R([m] + 1) − R([m]) < R([m]) − R([m] − 1) and R([m] + 1) − R([m]) < R([m] + 2) − R([m] + 1). By the Mean Value Theorem there are ξ1 ∈ ([m] − 1, [m]), ξ2 ∈ ([m], [m] + 1) and ξ3 ∈ ([m] + 1, [m] + 2) such that

r(ξ2) < r(ξ1) and r(ξ2) < r(ξ3). (2.9)

Since m is the mode of r(t), (2.9) implies that ξ2 ≠ m and so ξ2 < m or ξ2 > m. If we assume ξ2 < m, then we have ξ1 < ξ2 < m. Since r(t) is strictly increasing for all t < m, we get r(ξ1) < r(ξ2), which contradicts (2.9). Similarly, assuming ξ2 > m also contradicts (2.9). Hence g([m]) ≥ g([m] − 1) or g([m]) ≥ g([m] + 1), and this ends the proof of the claim.

Now, the bimodality of g(x) can only occur when g([m]) < g([m] − 1) and g([m]) < g([m] + 1). Since this is impossible, g(x) has a unique maximum value which is the highest value among g([m] − 1), g([m]) and g([m] + 1). Hence, g(x) is unimodal. □

Corollary 1. If the non-negative continuous random variable T has a unique mode m > 0, then the mode of the T-geometric distribution is at the point [m] − 1, [m], or [m] + 1. If [m] = 0, the mode of the T-geometric distribution is at the point 0 or 1.

The hazard function associated with the T-geometric family of distributions in (2.5) is

$$h_g(x) = \frac{R(c(x + 1)) - R(cx)}{1 - R(c(x + 1))} = \frac{1 - R(cx)}{1 - R(c(x + 1))} - 1. \tag{2.10}$$

Theorem 3. If the distribution of a random variable T has an increasing failure rate (IFR) (or a decreasing failure rate (DFR)), then the T-geometric distribution has an IFR (or a DFR).

Proof. Without loss of generality assume c = 1; then g(x) = R(x + 1) − R(x), x = 0, 1, 2, . . . . Roy and Dasgupta [26] showed that for any continuous CDF, F(t), the discretized distribution of the form

P(X = x) = F(x + δ) − F(x − [1 − δ]), x ∈ Z and δ ∈ [0, 1], (2.11)

has an IFR [or a DFR] when the distribution of the random variable T has an IFR [or a DFR]. The result follows by taking δ = 1 in (2.11), which completes the proof. □
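The hazard function in (2.10) and the monotonicity stated in Theorem 3 can be examined numerically. The sketch below assumes a Weibull T, which is known to have an IFR when its shape parameter exceeds 1 and a DFR when the shape is below 1; the function names and the parameter values are illustrative assumptions.

```python
import math


def weibull_survival(t, gamma, alpha):
    """Survival function 1 - R(t) = exp(-(t/gamma)^alpha)."""
    return math.exp(-((t / gamma) ** alpha))


def t_geometric_hazard(x, c, gamma, alpha):
    """Hazard (2.10): (1 - R(cx)) / (1 - R(c(x + 1))) - 1."""
    upper = weibull_survival(c * x, gamma, alpha)
    lower = weibull_survival(c * (x + 1), gamma, alpha)
    return upper / lower - 1.0


c, gamma = 1.0, 3.0  # illustrative values
ifr = [t_geometric_hazard(x, c, gamma, 2.0) for x in range(10)]   # Weibull shape > 1
dfr = [t_geometric_hazard(x, c, gamma, 0.5) for x in range(10)]   # Weibull shape < 1
is_increasing = all(a < b for a, b in zip(ifr, ifr[1:]))
is_decreasing = all(a > b for a, b in zip(dfr, dfr[1:]))
```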

3. Definition and some properties of the EEGD

Gupta and Kundu [7] defined and studied the exponentiated-exponential distribution. If a random variable T follows the exponentiated-exponential distribution, then its CDF is given by $R(t) = (1 - e^{-\lambda t})^{\alpha}$, t > 0, α > 0, λ > 0. If X is a random variable that follows the geometric distribution in (2.4), then the T-X family in (2.2) leads to the exponentiated-exponential–geometric distribution with the pmf

$$g(x) = \left(1 - p^{\lambda(x+1)}\right)^{\alpha} - \left(1 - p^{\lambda x}\right)^{\alpha}, \quad x = 0, 1, 2, \ldots. \tag{3.1}$$

On replacing $p^{\lambda}$ by θ, (3.1) can be written as

$$g(x) = \left(1 - \theta^{x+1}\right)^{\alpha} - \left(1 - \theta^{x}\right)^{\alpha}, \quad x = 0, 1, 2, \ldots, \tag{3.2}$$

where 0 < θ < 1.

A random variable X with the pmf g(x) in (3.2) is said to have the exponentiated-exponential–geometric distribution (EEGD) with parameters α and θ. Note that if α = 1, that is, the random variable T has the exponential distribution, then the EEGD reduces to the geometric distribution with probability of failure being θ. This property shows that the geometric distribution is the discrete analogue of the exponential distribution. Roy [23] noticed this property in his study of reliability measures (see also [10, p. 210]). In Fig. 1, various graphs of g(x) are provided. These plots indicate that the EEGD is unimodal and right skewed.

The following lemma gives the relation between the EEGD, the exponentiated-exponential distribution and the exponential distribution.

Lemma 3 (Transformations).

(a) If Y has the exponentiated-exponential distribution with parameters α and − log θ (0 < θ < 1), then the random variable X = [Y] follows the EEGD with parameters α and θ.

(b) If Y follows the exponential distribution with mean −1/ log θ (0 < θ < 1), then the random variable $X = \left[\log_{\theta}\left(1 - (1 - \theta^{Y})^{1/\alpha}\right)\right]$ follows the EEGD with parameters α and θ.

(c) If Y follows the exponential distribution with mean α, then the random variable $X = \left[\log_{\theta}\left(1 - (1 - e^{-Y/\alpha})^{1/\alpha}\right)\right]$ follows the EEGD with parameters α and θ.

Proof. The proof is similar to the proof of Lemma 1 with T being an exponentiated-exponential random variable. □
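Part (b) of Lemma 3 gives a direct recipe for generating EEGD variates from exponential ones. The sketch below is a minimal illustration under assumed parameter values, seed and sample size; the helper names are hypothetical, and the small guard against log(0) is a numerical safeguard added for this example rather than part of the lemma.

```python
import math
import random
import sys


def eegd_pmf(x, alpha, theta):
    """pmf (3.2): (1 - theta^(x + 1))^alpha - (1 - theta^x)^alpha."""
    return (1.0 - theta ** (x + 1)) ** alpha - (1.0 - theta ** x) ** alpha


def sample_eegd(n, alpha, theta, rng):
    """Draw n EEGD variates via Lemma 3(b).

    Y is exponential with mean -1/log(theta) and
    X = [log_theta(1 - (1 - theta^Y)^(1/alpha))].
    """
    rate = -math.log(theta)                            # exponential rate = 1 / mean
    draws = []
    for _ in range(n):
        y = rng.expovariate(rate)
        inner = 1.0 - (1.0 - theta ** y) ** (1.0 / alpha)
        inner = max(inner, sys.float_info.min)         # guard against log(0)
        draws.append(math.floor(math.log(inner) / math.log(theta)))
    return draws


rng = random.Random(589)                # fixed seed so the run is reproducible
alpha, theta, n = 2.0, 0.5, 100_000     # illustrative values only
sample = sample_eegd(n, alpha, theta, rng)
max_gap = max(abs(sample.count(x) / n - eegd_pmf(x, alpha, theta)) for x in range(12))
```

The empirical frequencies should agree with (3.2) up to sampling error.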

The following lemma gives a sufficient condition for the EEGD to have a reversed J-shape.

Lemma 4. If α ≤ 1, then the distribution of the EEGD has a reversed J-shape.

Proof. Since the exponentiated-exponential distribution has a reversed J-shape for all α ≤ 1, the result follows from Lemma 2. □
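The pmf in (3.2) is easy to evaluate directly. The sketch below, with illustrative parameter values and a hypothetical function name, checks three properties stated above: the reduction to the geometric pmf at α = 1, the decreasing (reversed J) shape for α ≤ 1, and that the probabilities sum to one.

```python
def eegd_pmf(x, alpha, theta):
    """pmf (3.2): (1 - theta^(x + 1))^alpha - (1 - theta^x)^alpha."""
    return (1.0 - theta ** (x + 1)) ** alpha - (1.0 - theta ** x) ** alpha


theta = 0.5  # illustrative value
# alpha = 1 should reproduce the geometric pmf (1 - theta) * theta^x.
geometric_gap = max(
    abs(eegd_pmf(x, 1.0, theta) - (1.0 - theta) * theta ** x) for x in range(30)
)
# For alpha <= 1 the pmf should decrease in x (reversed J-shape, Lemma 4).
probs = [eegd_pmf(x, 0.5, theta) for x in range(15)]
reversed_j = all(a > b for a, b in zip(probs, probs[1:]))
# The partial sums telescope to (1 - theta^(N + 1))^alpha, which tends to 1.
total = sum(eegd_pmf(x, 0.5, theta) for x in range(200))
```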

The following theorem shows that the EEGD is unimodal.

Theorem 4. The EEGD is unimodal. When α ≤ 1, the mode is at the point 0 and when α > 1 the mode is at the point $[-\log_{\theta}\alpha] - 1$, $[-\log_{\theta}\alpha]$, or $[-\log_{\theta}\alpha] + 1$.

Proof. If the random variable T follows the exponentiated-exponential distribution with parameters α and − log θ (0 < θ < 1), then Lemma 3 shows that the random variable X = [T] follows the EEGD and hence Theorem 2 implies that the EEGD is unimodal. If α ≤ 1, it follows from Lemma 4 that m = 0 is the mode. For α > 1, the exponentiated-exponential distribution with parameters α and − log θ has a unique modal point at $m = -\log_{\theta}\alpha$, obtained by solving the equation r′(t) = 0. Hence, by Corollary 1, the mode of the EEGD is at the point $[-\log_{\theta}\alpha] - 1$, $[-\log_{\theta}\alpha]$ or $[-\log_{\theta}\alpha] + 1$. □
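Theorem 4 can be checked by brute force. The sketch below is an illustration with hypothetical helper names: it finds the mode by direct search over a finite range (the cutoff `x_max` is an assumption) and compares it with the three candidate points from the theorem for a few parameter pairs.

```python
import math


def eegd_pmf(x, alpha, theta):
    """pmf (3.2) of the EEGD."""
    return (1.0 - theta ** (x + 1)) ** alpha - (1.0 - theta ** x) ** alpha


def eegd_mode(alpha, theta, x_max=500):
    """Locate the mode by direct search over x = 0, ..., x_max."""
    return max(range(x_max + 1), key=lambda x: eegd_pmf(x, alpha, theta))


def mode_candidates(alpha, theta):
    """Candidate modes from Theorem 4 for alpha > 1."""
    m = -math.log(alpha) / math.log(theta)     # m = -log_theta(alpha)
    base = math.floor(m)
    return {base - 1, base, base + 1}


cases = [(2.0, 0.5), (2.0, 0.75), (3.0, 0.75)]   # illustrative (alpha, theta) pairs
modes = [eegd_mode(a, t) for a, t in cases]
inside = [mode in mode_candidates(a, t) for mode, (a, t) in zip(modes, cases)]
```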

The formulas for calculating the moments of the EEGD are given in the following theorem.

Theorem 5. If X is a random variable following the EEGD, then the rth moment can be obtained by using the formula

$$E(X^r) = \sum_{k=1}^{\infty} \frac{(-1)^k \alpha^{(k)}}{k!}\,(\theta^k - 1)\,\mathrm{Li}_{-r}(\theta^k), \tag{3.3}$$

where $\alpha^{(k)} = \alpha(\alpha - 1)(\alpha - 2)\cdots(\alpha - k + 1)$ and $\mathrm{Li}_s(u)$ is the polylogarithm function (see [16]), defined as

$$\mathrm{Li}_s(u) = \sum_{i=1}^{\infty} u^i / i^s, \quad 0 < u < 1. \tag{3.4}$$

If α is a positive integer, then the summation in (3.3) will end at α.

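Proof. By using the generalized binomial expansion, g(x) in (3.2) can be written as

$$g(x) = \sum_{k=1}^{\infty} \frac{\alpha^{(k)}}{k!}\,(-1)^k (\theta^k - 1)\,\theta^{kx}. \tag{3.5}$$

The rth moment of the EEGD is

$$E(X^r) = \sum_{x=0}^{\infty} x^r g(x) = \sum_{x=0}^{\infty} x^r \sum_{k=1}^{\infty} \frac{\alpha^{(k)}}{k!}\,(-1)^k (\theta^k - 1)\,\theta^{kx} = \sum_{k=1}^{\infty} \frac{\alpha^{(k)}}{k!}\,(-1)^k (\theta^k - 1) \sum_{x=1}^{\infty} x^r (\theta^k)^x. \tag{3.6}$$

By using the polylogarithm function in (3.4), the result in (3.6) reduces to

$$E(X^r) = \sum_{k=1}^{\infty} \frac{\alpha^{(k)}}{k!}\,(-1)^k (\theta^k - 1)\,\mathrm{Li}_{-r}(\theta^k). \quad \square$$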

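By using (3.3), the first two moments of the EEGD are given by

$$E(X) = \sum_{k=1}^{\infty} \frac{(-1)^k \alpha^{(k)}}{k!}\,\frac{\theta^k}{\theta^k - 1}, \tag{3.7}$$

and

$$E(X^2) = \sum_{k=1}^{\infty} (-1)^{k+1}\,\frac{\alpha^{(k)}}{k!}\,\frac{\theta^k (1 + \theta^k)}{(1 - \theta^k)^2}. \tag{3.8}$$

The variance of the EEGD is given by

$$\sigma^2 = E(X^2) - (E(X))^2 = \sum_{k=1}^{\infty} (-1)^{k+1}\,\frac{\alpha^{(k)}}{k!}\,\frac{\theta^k (1 + \theta^k)}{(1 - \theta^k)^2} - \left[\sum_{k=1}^{\infty} \frac{(-1)^k \alpha^{(k)}}{k!}\,\frac{\theta^k}{\theta^k - 1}\right]^2. \tag{3.9}$$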

Table 1 provides the mean and the variance for the EEGD when α = 1, 2, and 3.

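Table 1
The mean and variance of the EEGD when α = 1, 2, 3.

α = 1:  $E(X) = \theta(1 - \theta)^{-1}$;  $\sigma^2 = \theta(1 - \theta)^{-2}$
α = 2:  $E(X) = \theta(\theta + 2)(1 - \theta^2)^{-1}$;  $\sigma^2 = \theta(2\theta^2 + \theta + 2)(1 - \theta^2)^{-2}$
α = 3:  $E(X) = \theta(\theta^3 + 4\theta^2 + 3\theta + 3)(1 + \theta)^{-1}(1 - \theta^3)^{-1}$;  $\sigma^2 = \theta(3\theta^6 + 3\theta^5 + 13\theta^4 + 11\theta^3 + 13\theta^2 + 3\theta + 3)(1 + \theta)^{-2}(1 - \theta^3)^{-2}$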
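For a positive integer α the series in (3.7) and (3.8) are finite, since $\alpha^{(k)}/k!$ is then the binomial coefficient and vanishes for k > α. The sketch below uses this to evaluate the mean and variance and compares them with a brute-force sum over the pmf. The function names, the truncation point `x_max` and the parameter values are assumptions made for illustration.

```python
import math


def eegd_pmf(x, alpha, theta):
    """pmf (3.2) of the EEGD."""
    return (1.0 - theta ** (x + 1)) ** alpha - (1.0 - theta ** x) ** alpha


def eegd_mean_series(alpha, theta):
    """Mean from (3.7) for a positive integer alpha (finite sum)."""
    return sum(
        (-1) ** k * math.comb(alpha, k) * theta ** k / (theta ** k - 1.0)
        for k in range(1, alpha + 1)
    )


def eegd_second_moment_series(alpha, theta):
    """E(X^2) from (3.8) for a positive integer alpha (finite sum)."""
    return sum(
        (-1) ** (k + 1) * math.comb(alpha, k) * theta ** k * (1.0 + theta ** k)
        / (1.0 - theta ** k) ** 2
        for k in range(1, alpha + 1)
    )


def eegd_moment_direct(r, alpha, theta, x_max=2000):
    """Brute-force E(X^r), truncating the sum over x at x_max."""
    return sum(x ** r * eegd_pmf(x, alpha, theta) for x in range(x_max + 1))


alpha, theta = 3, 0.5  # illustrative values
mean = eegd_mean_series(alpha, theta)
variance = eegd_second_moment_series(alpha, theta) - mean ** 2
```

For these values the results can be compared with the α = 3 row of Table 1 and with the corresponding row of Table 2.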

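Theorem 6. If X is a random variable that follows the EEGD, then the probability generating function is given by

$$\varphi(t) = \sum_{k=1}^{\infty} \frac{(-1)^k \alpha^{(k)}}{k!}\,\frac{\theta^k - 1}{1 - t\theta^k}, \quad |t| < \theta^{-1}. \tag{3.10}$$

Proof. By using (3.5), the probability generating function φ(t) can be written as

$$E(t^X) = \sum_{k=1}^{\infty} \frac{(-1)^k \alpha^{(k)}}{k!}\,(\theta^k - 1) \sum_{x=0}^{\infty} (t\theta^k)^x = \sum_{k=1}^{\infty} \frac{(-1)^k \alpha^{(k)}}{k!}\,\frac{\theta^k - 1}{1 - t\theta^k}, \quad |t| < \theta^{-1}. \quad \square$$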

Corollary 2. The descending factorial moment $E(X^{(m)})$, where $E(X^{(m)}) = E[X(X - 1)(X - 2)\cdots(X - m + 1)]$, for the EEGD is given by

$$E(X^{(m)}) = \sum_{k=1}^{\infty} (-1)^{k+1}\,\frac{\alpha^{(k)}}{k!}\,\frac{m!\,\theta^{mk}}{(1 - \theta^k)^m}. \tag{3.11}$$

Proof. The descending factorial moments can be found by using the fact that

$$E(X^{(m)}) = \left.\frac{d^m \varphi(t)}{dt^m}\right|_{t=1}.$$

On using (3.10),

$$\frac{d^m \varphi(t)}{dt^m} = \sum_{k=1}^{\infty} \frac{(-1)^k \alpha^{(k)}}{k!}\,\frac{m!\,(\theta^k - 1)\,\theta^{mk}}{(1 - t\theta^k)^{m+1}}. \tag{3.12}$$

The result follows by substituting t = 1 in (3.12). □
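The probability generating function (3.10) and the factorial moments (3.11) can be verified against direct sums over the pmf. As before, the sketch restricts α to a positive integer so that the series are finite; the function names, the value t = 0.8 and the truncation at 2000 terms are illustrative assumptions.

```python
import math


def eegd_pmf(x, alpha, theta):
    """pmf (3.2) of the EEGD."""
    return (1.0 - theta ** (x + 1)) ** alpha - (1.0 - theta ** x) ** alpha


def eegd_pgf(t, alpha, theta):
    """pgf (3.10) for a positive integer alpha (finite sum), |t| < 1/theta."""
    return sum(
        (-1) ** k * math.comb(alpha, k) * (theta ** k - 1.0) / (1.0 - t * theta ** k)
        for k in range(1, alpha + 1)
    )


def eegd_factorial_moment(m, alpha, theta):
    """Descending factorial moment (3.11) for a positive integer alpha."""
    return sum(
        (-1) ** (k + 1) * math.comb(alpha, k) * math.factorial(m)
        * theta ** (m * k) / (1.0 - theta ** k) ** m
        for k in range(1, alpha + 1)
    )


alpha, theta, t = 3, 0.5, 0.8  # illustrative values
pgf_direct = sum(t ** x * eegd_pmf(x, alpha, theta) for x in range(2000))
fm2_direct = sum(x * (x - 1) * eegd_pmf(x, alpha, theta) for x in range(2000))
```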

Over-dispersion relative to the Poisson distribution is a situation in which the variance exceeds the mean and under-dispersion is the opposite. The EEGD satisfies the over-dispersion property when α ≤ 2. However, for values of α > 2, the distribution can be over-dispersed, under-dispersed or equi-dispersed. Since the mean and variance of the EEGD are expressed in terms of infinite series, a numerical method is used to determine the points where the distribution is equi-dispersed. A non-linear cubic function relating α to θ is obtained for the situation when the distribution is equi-dispersed. Fig. 2 shows the regions in which the EEGD is over-dispersed and under-dispersed. The non-linear curve in Fig. 2 connects the points where the EEGD is equi-dispersed.

Fig. 2. Dispersion regions for the EEGD.
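The dispersion behaviour described above can be explored numerically. The sketch below is a rough illustration, not the numerical method used in the paper: it computes the mean and variance by truncated direct summation (the cutoff `x_max` is an assumption suitable for moderate θ) and labels a parameter pair as over-, under- or equi-dispersed.

```python
def eegd_pmf(x, alpha, theta):
    """pmf (3.2) of the EEGD."""
    return (1.0 - theta ** (x + 1)) ** alpha - (1.0 - theta ** x) ** alpha


def eegd_mean_variance(alpha, theta, x_max=5000):
    """Mean and variance by direct summation, truncated at x_max."""
    mean = second = 0.0
    for x in range(x_max + 1):
        g = eegd_pmf(x, alpha, theta)
        mean += x * g
        second += x * x * g
    return mean, second - mean * mean


def dispersion(alpha, theta):
    """Label the (alpha, theta) pair by comparing the variance with the mean."""
    mean, variance = eegd_mean_variance(alpha, theta)
    if variance > mean:
        return "over"
    if variance < mean:
        return "under"
    return "equi"


labels = {pair: dispersion(*pair) for pair in [(0.5, 0.25), (2.0, 0.5), (3.0, 0.1), (3.0, 0.5)]}
```

A root-finder applied to variance minus mean would be one way to trace the equi-dispersion curve, although that step is not shown here.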

Table 2 provides the mean, variance, skewness, kurtosis and mode of the EEGD for various values of α and θ. For fixed α, the mean, the variance and the mode of the EEGD are increasing functions of θ. Also, for fixed θ, the mean, the variance and the mode of the EEGD are increasing functions of α while the skewness and the kurtosis are decreasing functions of α. The skewness of the EEGD is always positive.

The hazard function associated with the EEGD is

$$h_g(x) = \frac{g(x)}{1 - G(x)} = \frac{1 - (1 - \theta^{x})^{\alpha}}{1 - (1 - \theta^{x+1})^{\alpha}} - 1. \tag{3.13}$$

By using L'Hôpital's rule, it is not difficult to show that $\lim_{x \to \infty} h_g(x) = \theta^{-1} - 1$. This shows that the hazard function of the EEGD has a horizontal asymptote at $y = \theta^{-1} - 1$.

The following theorem gives sufficient conditions for the hazard function in (3.13) to be an IFR, a DFR and a constant failure rate (CFR).

Theorem 7. The EEGD has an IFR when α > 1, a DFR when α < 1, and a CFR when α = 1.

Proof. The exponentiated-exponential distribution has an IFR when α > 1 and a DFR when α < 1 [7] and hence Theorem 3 implies that when α > 1 (or α < 1), the EEGD has an IFR (or a DFR). For α = 1, the EEGD hazard function in (3.13) reduces to $h_g(x) = \theta^{-1} - 1$, which is a constant. □
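Theorem 7 and the limiting value of the hazard can be checked with a few lines of code. In the sketch below the tail probability 1 − (1 − θ^x)^α is computed with `math.expm1` and `math.log1p` to limit cancellation for large x; this numerical device, the function names and the parameter values are assumptions of the example.

```python
import math


def eegd_upper_tail(x, alpha, theta):
    """1 - (1 - theta^x)^alpha, computed stably; equals 1 at x = 0."""
    if x == 0:
        return 1.0
    return -math.expm1(alpha * math.log1p(-(theta ** x)))


def eegd_hazard(x, alpha, theta):
    """Hazard (3.13): [1 - (1 - theta^x)^alpha] / [1 - (1 - theta^(x+1))^alpha] - 1."""
    return eegd_upper_tail(x, alpha, theta) / eegd_upper_tail(x + 1, alpha, theta) - 1.0


theta = 0.5                                   # illustrative value
limit = 1.0 / theta - 1.0                     # asymptote theta^(-1) - 1
ifr = [eegd_hazard(x, 2.0, theta) for x in range(20)]   # alpha > 1
dfr = [eegd_hazard(x, 0.5, theta) for x in range(20)]   # alpha < 1
cfr = [eegd_hazard(x, 1.0, theta) for x in range(20)]   # alpha = 1
```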

4. Parameter estimation for EEGD

The maximum likelihood method is applied to estimate the EEGD parameters. Let a random sample of size n be taken from the EEGD and let the observed frequencies be denoted by $n_x$, x = 0, 1, 2, . . . , k, where $\sum_{x=0}^{k} n_x = n$. The log-likelihood function of the EEGD in (3.2) can be written as

$$\log L = n_0\,\alpha \log(1 - \theta) + \sum_{x=1}^{k} n_x \log\left[(1 - \theta^{x+1})^{\alpha} - (1 - \theta^{x})^{\alpha}\right]. \tag{4.1}$$

Table 2
Mean, variance, skewness, kurtosis and mode of the EEGD for some values of α and θ.

α     θ      Mean      Variance   Skewness  Kurtosis  Mode
0.5   0.1    0.0569    0.0660     5.1088    34.1840   0
0.5   0.25   0.1762    0.2574     3.6969    20.9665   0
0.5   0.5    0.5546    1.2778     3.0319    15.8103   0
0.5   0.75   1.7438    8.2082     2.7550    13.8730   0
0.5   0.9    5.3921    63.2830    2.6632    13.2743   0
1     0.1    0.1111    0.1235     3.4785    17.1000   0
1     0.25   0.3333    0.4444     2.5000    11.2500   0
1     0.5    1         2          2.1213    9.5000    0
1     0.75   3         12         2.0207    9.0833    0
1     0.9    9         90         2.0028    9.0111    0
2     0.1    0.2121    0.2163     2.2755    8.7151    0
2     0.25   0.6       0.6756     1.6521    6.8071    0
2     0.5    1.6667    2.6667     1.5650    6.8750    1
2     0.75   4.7143    15.1837    1.5977    7.0367    2
2     0.9    13.7368   112.6870   1.6082    7.0740    6
3     0.1    0.3040    0.2851     1.7148    6.0701    0
3     0.25   0.8159    0.7907     1.3088    5.6787    0
3     0.5    2.1429    2.9252     1.3935    6.2587    1
3     0.75   5.8726    16.5320    1.4520    6.4451    3
3     0.9    16.9006   122.6970   1.4621    6.4763    10
4     0.1    0.3877    0.3349     1.3690    4.8686    0
4     0.25   0.9929    0.8446     1.1456    5.3392    1
4     0.5    2.5048    3.0520     1.3232    6.0114    2
4     0.75   6.7418    17.2851    1.3766    6.1629    4
4     0.9    19.2734   128.3269   1.3853    6.1895    13
5     0.1    0.4647    0.3701     1.1288    4.2509    0
5     0.25   1.1401    0.8674     1.0725    5.2673    1
5     0.5    2.7942    3.1304     1.2843    5.8676    2
5     0.75   7.4370    17.7680    1.3298    5.9977    5
5     0.9    21.1716   131.9302   1.3380    6.0222    15

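The derivatives of (4.1) with respect to α and θ are, respectively, given by

$$\frac{\partial \log L}{\partial \alpha} = n_0 \log(1 - \theta) + \sum_{x=1}^{k} \frac{\left[(1 - \theta^{x+1})^{\alpha} \log(1 - \theta^{x+1}) - (1 - \theta^{x})^{\alpha} \log(1 - \theta^{x})\right] n_x}{(1 - \theta^{x+1})^{\alpha} - (1 - \theta^{x})^{\alpha}}, \tag{4.2}$$

$$\frac{\partial \log L}{\partial \theta} = \frac{-n_0\,\alpha}{1 - \theta} + \sum_{x=1}^{k} \frac{\left[x(1 - \theta^{x})^{\alpha - 1} - \theta(x + 1)(1 - \theta^{x+1})^{\alpha - 1}\right] \alpha\,\theta^{x-1}\, n_x}{(1 - \theta^{x+1})^{\alpha} - (1 - \theta^{x})^{\alpha}}. \tag{4.3}$$

The maximum likelihood estimates $\hat{\alpha}$ and $\hat{\theta}$ can be found by setting (4.2) and (4.3) to zero and then solving the equations by an iterative technique.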
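Before handing (4.2) and (4.3) to an iterative solver, it can be useful to confirm that they agree with numerical derivatives of (4.1). The sketch below does this with central differences. The frequency counts are made up for illustration and are not one of the paper's data sets; the function names and the evaluation point are likewise assumptions.

```python
import math


def eegd_loglik(freq, alpha, theta):
    """Log-likelihood (4.1); freq[x] is the observed count n_x of the value x."""
    total = freq[0] * alpha * math.log(1.0 - theta)
    for x in range(1, len(freq)):
        g = (1.0 - theta ** (x + 1)) ** alpha - (1.0 - theta ** x) ** alpha
        total += freq[x] * math.log(g)
    return total


def eegd_score(freq, alpha, theta):
    """Partial derivatives (4.2) and (4.3) of the log-likelihood."""
    d_alpha = freq[0] * math.log(1.0 - theta)
    d_theta = -freq[0] * alpha / (1.0 - theta)
    for x in range(1, len(freq)):
        a = 1.0 - theta ** (x + 1)
        b = 1.0 - theta ** x
        g = a ** alpha - b ** alpha
        d_alpha += freq[x] * (a ** alpha * math.log(a) - b ** alpha * math.log(b)) / g
        d_theta += (
            freq[x] * alpha * theta ** (x - 1)
            * (x * b ** (alpha - 1.0) - theta * (x + 1) * a ** (alpha - 1.0)) / g
        )
    return d_alpha, d_theta


freq = [40, 30, 15, 8, 4, 2, 1]      # made-up counts, for illustration only
alpha, theta, h = 1.7, 0.45, 1e-6    # arbitrary evaluation point and step size
d_alpha, d_theta = eegd_score(freq, alpha, theta)
fd_alpha = (eegd_loglik(freq, alpha + h, theta) - eegd_loglik(freq, alpha - h, theta)) / (2 * h)
fd_theta = (eegd_loglik(freq, alpha, theta + h) - eegd_loglik(freq, alpha, theta - h)) / (2 * h)
```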

The initial estimate used for the parameter θ is the moment estimate of θ obtained by considering the data to be from the geometric distribution. Thus, the initial value of θ is $\tilde{\theta} = \bar{x}/(1 + \bar{x})$, where $\bar{x}$ is the sample mean. The initial estimate for α is determined by equating the ratio of ‘‘one frequency’’ to ‘‘zero frequency’’ in the sample, $n_1/n_0$, to the corresponding population ratio, $g(1)/g(0) = (1 + \theta)^{\alpha} - 1$, and solving for α. The equation is given by $n_1/n_0 = (1 + \theta)^{\alpha} - 1$, which can be solved for α as $\alpha = \log(n_1/n_0 + 1)/\log(\theta + 1)$. On replacing θ by $\tilde{\theta}$, the initial estimate for α is $\tilde{\alpha} = \log(n_1/n_0 + 1)/\log(\tilde{\theta} + 1)$.
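The starting values described above take only a few lines to compute. In the sketch below the function name `initial_estimates` and the input are assumptions: the input consists of expected counts from a geometric law with θ = 0.4, chosen so that the starting value for α should come out close to 1.

```python
import math


def initial_estimates(freq):
    """Starting values for (alpha, theta) from frequencies freq[x] = n_x."""
    n = sum(freq)
    x_bar = sum(x * n_x for x, n_x in enumerate(freq)) / n
    theta0 = x_bar / (1.0 + x_bar)                       # geometric moment estimate
    alpha0 = math.log(freq[1] / freq[0] + 1.0) / math.log(theta0 + 1.0)
    return alpha0, theta0


# Expected counts from a geometric law with theta = 0.4 (illustrative input).
theta_true = 0.4
freq = [1000.0 * (1.0 - theta_true) * theta_true ** x for x in range(60)]
alpha0, theta0 = initial_estimates(freq)
```

These values would then be passed to whichever iterative routine is used to solve the likelihood equations.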

5. Tests to compare EEGD with geometric distribution

The EEGD reduces to the geometric distribution when α = 1. Thus, to compare the EEGD with the geometric distribution, we test the null hypothesis

H0 : α = 1 against H1 : α ≠ 1. (5.1)

To test the null hypothesis in (5.1), one can use the likelihood ratio test, the Wald test, or the score test. The likelihood ratio statistic for testing the null hypothesis in (5.1) is based on

$$\lambda = L_0(\hat{\theta}) / L_1(\hat{\alpha}, \hat{\theta}), \tag{5.2}$$

where $L_0$ and $L_1$ are the likelihood functions for the geometric distribution and the EEGD, respectively. The likelihood ratio statistic is defined as −2 log λ, which has an approximate chi-square distribution with 1 degree of freedom. The Wald statistic for testing the null hypothesis in (5.1) is defined as

$$W = (\hat{\alpha} - 1)/\mathrm{se}(\hat{\alpha}), \tag{5.3}$$

where $\hat{\alpha}$ is the MLE for the EEGD and $\mathrm{se}(\hat{\alpha})$ is the standard error of $\hat{\alpha}$. W has an approximate standard normal distribution. The score statistic [6] for testing the null hypothesis in (5.1) is given by

$$S = V^{T} I^{-1} V, \tag{5.4}$$

where the score vector is $V = (\partial l/\partial \alpha,\ \partial l/\partial \theta)^{T}$ and the information matrix I is the 2 × 2 matrix with entries $I_{ij} = -E\left[\partial^2 l/\partial b_i \partial b_j\right]$, where $b_1 = \alpha$, $b_2 = \theta$ and l = log L is the log-likelihood function of the EEGD in (4.1). Both V and I in (5.4) are evaluated under the null hypothesis H0, that is, at α = 1 and $\theta = \bar{x}/(1 + \bar{x})$, which is the MLE for θ in the geometric distribution. The score statistic S has an approximate chi-square distribution with 1 degree of freedom.
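The likelihood ratio statistic and the p-values for the chi-square and normal approximations can be sketched as follows. This is a hedged illustration: the EEGD estimates are assumed to be supplied by the caller (no optimizer is included), the counts are made up, and all function names are hypothetical. The tail probability of a chi-square variable with 1 degree of freedom is computed from the identity P(χ² > s) = erfc(√(s/2)).

```python
import math


def eegd_loglik(freq, alpha, theta):
    """EEGD log-likelihood; freq[x] is the count n_x of the value x."""
    total = 0.0
    for x, n_x in enumerate(freq):
        g = (1.0 - theta ** (x + 1)) ** alpha - (1.0 - theta ** x) ** alpha
        total += n_x * math.log(g)
    return total


def geometric_loglik(freq, theta):
    """Geometric log-likelihood with pmf (1 - theta) * theta^x."""
    return sum(n_x * (math.log(1.0 - theta) + x * math.log(theta))
               for x, n_x in enumerate(freq))


def geometric_mle(freq):
    """MLE of theta under H0: x_bar / (1 + x_bar)."""
    x_bar = sum(x * n_x for x, n_x in enumerate(freq)) / sum(freq)
    return x_bar / (1.0 + x_bar)


def likelihood_ratio_statistic(freq, alpha_hat, theta_hat):
    """-2 log(lambda) from (5.2), given EEGD estimates supplied by the caller."""
    return 2.0 * (eegd_loglik(freq, alpha_hat, theta_hat)
                  - geometric_loglik(freq, geometric_mle(freq)))


def chi_square_1df_pvalue(stat):
    """P(chi-square with 1 df > stat) = erfc(sqrt(stat / 2))."""
    return math.erfc(math.sqrt(max(stat, 0.0) / 2.0))


def wald_pvalue(w):
    """Two-sided p-value for W in (5.3) under the standard normal approximation."""
    return math.erfc(abs(w) / math.sqrt(2.0))


freq = [40, 30, 15, 8, 4, 2, 1]   # made-up counts, for illustration only
# Plugging in alpha = 1 and the geometric MLE should give a statistic of zero.
stat_at_null = likelihood_ratio_statistic(freq, 1.0, geometric_mle(freq))
```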

A simulation study is conducted to compare the performance of the three test statistics for the hypothesis in (5.1). We consider the values 0.4, 0.6, 0.8, 1, 1.2, 1.4 and 2 for the parameter α and the values 0.25, 0.5 and 0.75 for the parameter θ. Three different sample sizes, n = 100, 300 and 500, are considered. For each parameter combination, we generate a random sample from the EEGD and then conduct the three tests at the 5% and 10% levels of significance. The simulation is repeated 1000 times for each parameter combination in order to calculate the proportion of times the null hypothesis H0 is rejected. The proportion of times that H0 is rejected is used to estimate the power of the test. The proportions for the 5% level of significance are reported in Table 3. The results for the 10% level of significance are similar and they are not reported in order to conserve space. When α = 1, the EEGD reduces to the geometric distribution, which is the distribution under the null hypothesis.

From the results in Table 3, we observe that for fixed α the power of the tests is an increasing function of θ. When α > 1, the likelihood ratio test is the most powerful, followed by the score test. When α < 1, the Wald and score tests are more powerful than the likelihood ratio test. When α = 1, the proportion of times that H0 is rejected is expected to be close to the nominal level (5%). As n increases, the score test appears to be closer to the nominal level (5%) than the likelihood ratio and the Wald tests.

6. Applications of EEGD

In this section, the EEGD is applied to three data sets. The first data set in Table 4 is from[4, p. 120] and it represents the observed frequencies of the number of outbreaks of strike in a coal-mining industry in the UK during 1948–1959. The second data set from [8], in Table 5, represents theobserved frequencies of the number of absences among shift-workers in a steel industry. The thirddata set, in Table 6, from [17, p. 135] represents the observed frequencies of the number of claimson automobile insurance policies in Australia. Tables 4–6 also contain the results of fitting these datasets to the EEGD, the generalized Poisson distribution (GPD) [4], and either the negative binomialdistribution (NBD) or the binomial distribution (BD). If the data exhibits over-dispersion, the NBD isappliedwhile the BD is appliedwhen the data is under-dispersed. Themethod ofmaximum likelihoodestimation is used to estimate the model parameters.

The data in Table 4 was originally analyzed by Kendall [14], who considered the number of outbreaks in 4-week periods for four industries (coal-mining, vehicle manufacture, ship building, and transport) during 1948–1959. Kendall [14] fitted the data to the Poisson distribution and concluded that the aggregate data for the four industries agree with the Poisson law, but that the law does not hold closely for individual industries. Consul [4], in his book, fitted the data for the four industries to the GPD and


Table 3. The proportion of times (out of 1000) that the null hypothesis is rejected at the 5% level.

Parameter value    n = 100                  n = 300                  n = 500
α      θ           LR*     Wald    Score    LR      Wald    Score    LR      Wald    Score

0.4    0.25        0.243   0.520   0.107    0.533   0.736   0.228    0.752   0.868   0.307
       0.50        0.688   0.843   0.439    0.991   0.997   0.557    1.000   1.000   0.534
       0.75        0.987   0.994   0.844    1.000   1.000   0.946    1.000   1.000   0.975

0.6    0.25        0.117   0.293   0.136    0.304   0.491   0.424    0.496   0.628   0.584
       0.50        0.370   0.525   0.502    0.846   0.893   0.902    0.967   0.980   0.982
       0.75        0.766   0.838   0.844    0.996   0.997   0.997    1.000   1.000   1.000

0.8    0.25        0.060   0.147   0.122    0.117   0.210   0.224    0.147   0.240   0.258
       0.50        0.123   0.213   0.234    0.306   0.388   0.401    0.457   0.528   0.533
       0.75        0.240   0.316   0.330    0.566   0.627   0.632    0.777   0.809   0.811

1.0    0.25        0.048   0.070   0.087    0.051   0.054   0.067    0.055   0.042   0.051
       0.50        0.063   0.048   0.066    0.048   0.041   0.045    0.058   0.054   0.054
       0.75        0.049   0.047   0.052    0.061   0.053   0.053    0.047   0.050   0.050

1.2    0.25        0.086   0.021   0.032    0.130   0.028   0.053    0.173   0.071   0.085
       0.50        0.144   0.048   0.065    0.301   0.205   0.216    0.433   0.362   0.373
       0.75        0.221   0.126   0.137    0.503   0.442   0.447    0.707   0.657   0.659

1.4    0.25        0.174   0.005   0.037    0.394   0.130   0.191    0.539   0.325   0.362
       0.50        0.360   0.151   0.190    0.793   0.701   0.713    0.927   0.904   0.904
       0.75        0.565   0.408   0.421    0.942   0.923   0.923    0.994   0.994   0.994

2.0    0.25        0.599   0.094   0.294    0.966   0.911   0.926    0.997   0.996   0.996
       0.50        0.955   0.873   0.891    0.999   0.999   0.999    1.000   1.000   1.000
       0.75        0.992   0.988   0.989    1.000   1.000   1.000    1.000   1.000   1.000

* LR = Likelihood ratio.

Table 4. The number of outbreaks of strike in the coal-mining industry in the UK.

                                   Expected
x-value               Observed     EEGD          BD            GPD

0                     46           46.00         50.44         50.01
1                     76           75.66         65.09         65.77
2                     24           26.02         32.27         32.23
3                     9            6.41          7.45          7.23
≥4                    1            1.90          0.75          0.76

Total                 156          156           156           156

Parameter estimates                α = 4.7984    r = 4.3191    λ = −0.1450
                                   θ = 0.2247    p = 0.2300    θ = 1.1377

χ2                                 0.50          4.7340        4.5141
df                                 1             1             1
p-value                            0.4785        0.0296        0.0336

The classes x = 3 and x ≥ 4 are pooled for the χ2 test.

the results showed that the GPD gives an adequate fit for the vehicle manufacture, ship building, and transport industries but did not provide an adequate fit to the data from the coal-mining industry. In this paper we fit the EEGD to all four data sets, but report the results only for the coal-mining industry data. The EEGD provides a good fit to all data sets. The results in Table 4 indicate that only the EEGD provides a good fit to the data from the coal-mining industry. The quantity (1 − θ)^α can be used to estimate the probability of having no strike. From Table 4, (1 − 0.2247)^4.7984 ≃ 0.2949 and hence, the estimated probability of having no strike in the coal-mining industry is 0.2949. These data sets indicate that the EEGD can be used to model the number of outbreaks of strike in industries.
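The same arithmetic extends to the whole EEGD column of Table 4. The sketch below assumes the EEGD probability mass function P(X = x) = (1 − θ^(x+1))^α − (1 − θ^x)^α, which gives (1 − θ)^α at x = 0, and pools the classes x = 3 and x ≥ 4 before computing the chi-square statistic, so that df = 4 − 1 − 2 = 1. The pooling and the function names are assumptions of this sketch. The p-value uses the identity P(χ² > c) = erfc(√(c/2)), which holds for one degree of freedom only.

```python
import math

ALPHA, THETA, N = 4.7984, 0.2247, 156  # EEGD estimates and sample size (Table 4)
observed = [46, 76, 24, 9, 1]          # x = 0, 1, 2, 3, >= 4


def eegd_cdf(x, alpha, theta):
    """Assumed EEGD CDF: F(x) = (1 - theta**(x + 1))**alpha, with F(-1) = 0."""
    return (1.0 - theta ** (x + 1)) ** alpha


def eegd_pmf(x, alpha, theta):
    """P(X = x) as a difference of consecutive CDF values."""
    return eegd_cdf(x, alpha, theta) - eegd_cdf(x - 1, alpha, theta)


probs = [eegd_pmf(x, ALPHA, THETA) for x in range(4)]
probs.append(1.0 - sum(probs))         # open class x >= 4
expected = [N * p for p in probs]

# Pool the last two classes (x = 3 and x >= 4) before the chi-square test.
obs_pooled = observed[:3] + [sum(observed[3:])]
exp_pooled = expected[:3] + [sum(expected[3:])]

chi2 = sum((o - e) ** 2 / e for o, e in zip(obs_pooled, exp_pooled))
p_value = math.erfc(math.sqrt(chi2 / 2.0))  # valid for df = 1 only

print([round(e, 2) for e in expected])
print(round(chi2, 2), round(p_value, 4))
```

The printed values can be compared with the EEGD column of Table 4. Differences in the last digit are possible because the parameter estimates are rounded to four decimals.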

The data in Table 5 was originally studied by Arbous and Sichel [2] in an attempt to create a model that can describe the distribution of absences in a group of people over single- and double-exposure periods. The original data contains the number of absences, x-value, of 248 shift workers in the two years 1947 and 1948. Arbous and Sichel [2] used the negative binomial distribution to model the data.


Table 5. The number of absences among shift-workers in a steel industry.

                                   Expected
x-value               Observed     EEGD          NBD           GPD

0                     7            9.75          11.86         9.18
1                     16           16.33         16.06         15.95
2                     23           18.47         17.72         19.23
3                     20           18.87         18.07         20.08
4                     23           18.36         17.67         19.54
5                     24           17.37         16.84         18.27
6                     12           16.13         15.76         16.69
7                     13           14.79         14.57         15.02
8                     9            13.43         13.33         13.40
9                     9            12.11         12.11         11.88
10                    8            10.86         10.93         10.49
11                    10           9.70          9.82          9.24
12                    8            8.63          8.78          8.13
13                    7            7.65          7.82          7.15
14                    2            6.77          6.95          6.28
15                    12           5.98          6.16          5.52
16                    3            5.27          5.44          4.86
17                    5            4.64          4.80          4.27
18                    4            4.08          4.23          3.76
19                    2            3.59          3.71          3.31
20                    2            3.15          3.26          2.92
21                    5            2.76          2.86          2.57
22                    5            2.42          2.50          2.27
23                    2            2.12          2.19          2.01
24                    1            1.86          1.91          1.77
25–48                 16           12.91         12.67         14.19

Total                 248          248           248           248

Parameter estimates                α = 1.5686    r = 1.5885    λ = 0.6410
                                   θ = 0.8730    p = 0.8525    θ = 3.2960

χ2                                 12.25         14.97         10.04
df                                 17            17            17
p-value                            0.7849        0.5977        0.9020

The classes 14–15, 16–17, 18–19, 20–21 and 22–24 are pooled for the χ2 test.

Gupta and Ong [8] proposed a four-parameter generalized negative binomial distribution to fit the data and compared it to the NBD and GPD. The chi-square value for their distribution was 8.27 with 15 degrees of freedom. The chi-square value obtained by Gupta and Ong for the GPD was 27.79 with 17 degrees of freedom. This chi-square value is much larger than the 10.04 provided for the GPD in Table 5. The GPD parameter estimates in Table 5 are slightly different from the values provided by Gupta and Ong [8]. The results from Table 5 indicate that the two-parameter EEGD and GPD provide a good fit to the data.

The data in Table 6, from [17, p. 135], represents the observed frequencies of the number of claims on automobile insurance policies in Australia. Klugman et al. [17] used the relationship between the population mean, variance and third central moment for the Poisson-binomial, negative binomial, Polya–Aeppli, Neyman Type A and Poisson-extended truncated negative binomial (Poisson-ETNB) distributions to show that only the Poisson-ETNB distribution is appropriate to model the data. About 88% of the x-values in Table 6 are zeros, and one might expect a zero-inflated model to provide an adequate fit. To check whether a zero-inflated distribution fits better than non-zero-inflated distributions, we fitted a three-parameter zero-inflated GPD to this data and found that its results were worse than those from the ordinary GPD.

Table 6 shows the results of fitting this data to the EEGD, NBD and GPD. Only the EEGD fits the data, while the NBD and the GPD do not provide an adequate fit. The distribution of the data has a reversed J-shape, which shows that the EEGD performs very well in modeling reversed J-shaped data.


Table 6. The number of claims on automobile insurance policies in Australia.

                                   Expected
x-value               Observed     EEGD          NBD           GPD

0                     565664       565665.40     565708.60     565710.10
1                     68714        68722.87      68569.62      68569.91
2                     5177         5153.58       5317.22       5312.38
3                     365          378.24        334.93        337.38
4                     24           27.72         18.66         19.19
5                     6            2.03          0.96          1.02
6                     0            0.17          0.04          0.05

Total                 639950       639950        639950        639950

Parameter estimates                α = 1.6215    r = 3.5777    λ = 0.0172
                                   θ = 0.0733    p = 0.0339    θ = 0.1233

χ2                                 0.57          12.14         10.70
df                                 2             2             2
p-value                            0.7515        0.0023        0.0047

The classes 4–6 are pooled for the χ2 test.

The three-parameter Poisson-ETNB suggested by Klugman et al. [17] also provides an adequate fit, with a p-value of 0.5895 from the chi-square statistic of 0.29.

In summary, the EEGD provides a very good fit to the three data sets. The EEGD fits the distributions of the number of strikes, claims, or absences in industries well. The graphs of the EEGD displayed in Fig. 1 show that this distribution is very flexible and can fit a wide range of discrete data sets very well. Also, the results from Tables 4–6 show that the expected frequency from the EEGD for x = 0 is closer to the observed frequency than the expected frequency from the GPD, NBD or BD. Furthermore, the simplicity of the pmf of the EEGD and the fact that the EEGD has a closed-form CDF add an extra advantage to the distribution.

7. Summary

A method to generate new families of discrete distributions is introduced. Some examples of discrete distributions are provided. Attention is devoted to a special class of the T-X families of discrete distributions when X is the geometric distribution. Some properties of the T-geometric family of distributions are obtained. A member of the T-geometric family of discrete distributions, the exponentiated-exponential–geometric distribution, is defined. Various properties of the EEGD are studied, including the moments and the probability generating function. Three tests of the geometric distribution against the EEGD are compared. The simulation study suggests that when α > 1, the likelihood ratio test is the most powerful, followed by the score test. When α < 1, the Wald and score tests are more powerful than the likelihood ratio test. Three real data sets are fitted to the EEGD and the results are compared with the GPD, NBD or BD. The results show that the EEGD gives a very good fit to each data set.

Acknowledgments

The authors are grateful for the comments and suggestions by the referees and the Editor-in-Chief. Their comments and suggestions have greatly improved the paper.

References

[1] A. Alzaatreh, C. Lee, F. Famoye, A new method for generating families of continuous distributions, 2011 (submitted for publication).
[2] A.G. Arbous, H.S. Sichel, New techniques for the analysis of absenteeism data, Biometrika 41 (1954) 77–90.
[3] N. Balakrishnan, V.B. Nevzorov, A Primer on Statistical Distributions, John Wiley & Sons, Inc., Hoboken, NJ, 2003.
[4] P.C. Consul, Generalized Poisson Distributions: Properties and Applications, Marcel Dekker, Inc., New York, NY, 1989.
[5] P.C. Consul, F. Famoye, Lagrangian Probability Distributions, Birkhäuser, Boston, Massachusetts, 2006.
[6] D.R. Cox, D.V. Hinkley, Theoretical Statistics, Chapman & Hall, Inc., London, 1974.
[7] R.D. Gupta, D. Kundu, Exponentiated-exponential family: an alternative to gamma and Weibull distributions, Biometrical Journal 43 (2001) 117–130.
[8] R.C. Gupta, S.D. Ong, A new generalization of the negative binomial distribution, Computational Statistics & Data Analysis 45 (2004) 287–300.
[9] S. Inusah, T.J. Kozubowski, A discrete analogue of the Laplace distribution, Journal of Statistical Planning and Inference 136 (2006) 1090–1102.
[10] N.L. Johnson, A.W. Kemp, S. Kotz, Univariate Discrete Distributions, third ed., John Wiley & Sons, Inc., Hoboken, NJ, 2005.
[11] L. Katz, Characteristics of frequency functions defined by first order difference equations, Ph.D. Thesis, University of Michigan, Ann Arbor, MI, 1945.
[12] A.W. Kemp, A wide class of discrete distributions and their associated differential equations, Sankhya Series A 30 (1968) 401–410.
[13] A.W. Kemp, Characterization of discrete normal distribution, Journal of Statistical Planning and Inference 63 (1997) 223–229.
[14] M.G. Kendall, Natural law in the social sciences, Journal of the Royal Statistical Society. Series A (General) 124 (1961) 1–19.
[15] A.M.S. Khan, A. Khalique, A.M. Abouammoh, On estimating parameters in a discrete Weibull distribution, IEEE Transactions on Reliability 38 (3) (1989) 348–350.
[16] A.N. Kirillov, Dilogarithm identities, Progress of Theoretical Physics Supplement 118 (1995) 61–142.
[17] S.A. Klugman, H.H. Panjer, G.E. Willmot, Loss Models: From Data to Decisions, third ed., John Wiley & Sons, Inc., Hoboken, NJ, 2008.
[18] H. Krishna, P.S. Pundir, Discrete Burr and discrete Pareto distributions, Statistical Methodology 6 (2009) 177–188.
[19] K.B. Kulasekera, Approximate MLE’s of the parameters of a discrete Weibull distribution with type I censored data, Microelectronics and Reliability 34 (7) (1994) 1185–1188.
[20] S. Li, F. Famoye, C. Lee, On the generalized Lagrangian probability distributions, Journal of Probability and Statistical Science 8 (1) (2010) 113–123.
[21] T. Nakagawa, S. Osaki, The discrete Weibull distribution, IEEE Transactions on Reliability 24 (5) (1975) 300–301.
[22] A.H.R. Roknabadi, G.R.M. Borzadaran, M. Khorashadizadeh, Some aspects of discrete hazard rate function in telescopic families, Economic Quality Control 24 (1) (2009) 35–42.
[23] D. Roy, Reliability measures in the discrete bivariate set-up and related characterization results for a bivariate geometric distribution, Journal of Multivariate Analysis 46 (1993) 362–373.
[24] D. Roy, The discrete normal distribution, Communications in Statistics-Theory and Methods 32 (2003) 1871–1883.
[25] D. Roy, Discrete Rayleigh distribution, IEEE Transactions on Reliability 53 (2) (2004) 255–260.
[26] D. Roy, T. Dasgupta, A discretizing approach for evaluating reliability of complex systems under stress–strength model, IEEE Transactions on Reliability 50 (2) (2001) 145–150.
[27] E.S. Stein, R. Dattero, A new discrete Weibull distribution, IEEE Transactions on Reliability 33 (2) (1984) 196–197.