Article Open Access

A New Method for Generalizing Burr and Related Distributions

  • Tanujit Chakraborty, Suchismita Das and Swarup Chattopadhyay
Published/Copyright: February 16, 2022

Abstract

A new method is proposed to generalize the Burr-XII distribution, also called the Burr distribution, by adding an extra parameter to an existing Burr distribution for more flexibility. In this method, the exponent of the Burr distribution is modeled using a nonlinear function of the data and one additional parameter. The models of this newly introduced generalized Burr family can significantly increase the flexibility of the former Burr distribution with respect to the density and hazard rate shapes. Families expanded using the proposed method are heavy-tailed and belong to the maximum domain of attraction of the Fréchet distribution. The method is further applied to yield three-parameter classical Pareto and generalized exponentiated distributions, which shows the broader applicability of the proposed idea of generalization. A relevant model of the new generalized Burr family is considered in detail, with particular emphasis on hazard functions, stochastic orders, estimation procedures, and testing methods. Finally, as empirical evidence, the new distribution is applied to the analysis of large-scale heavy-tailed network data and compared with other commonly used distributions for fitting degree distributions of networks. Experimental results suggest that the proposed Burr distribution with nonlinear exponent fits large-scale heavy-tailed networks better than the popularly used Marshall-Olkin generalization of Burr and exponentiated Burr distributions.

MSC 2010: Primary 60E05

1. Introduction

The development of statistical distributions is one of the oldest research topics in statistics, and recent decades have seen renewed interest in developing more flexible distributions. Since the seminal work by Karl Pearson in 1895 [37], several general methods have been developed for generating families of distributions. Pearson presented a systematic approach for generating statistical distributions to model non-symmetric data using a differential equation. The Pearson system of continuous distributions is a system for which every probability density function (PDF) f(x) satisfies a differential equation of the following form:

$$\frac{1}{f(x)}\frac{df(x)}{dx}=\frac{a+x}{b_0+b_1x+b_2x^2}, \tag{1.1}$$

where $a$, $b_0$, $b_1$, and $b_2$ are the shape parameters. Different types of distributions correspond to different forms of solution to Eqn. (1.1). The form of the solution of Eqn. (1.1) depends on the roots of the equation

$$b_0+b_1x+b_2x^2=0.$$

Pearson presented four types of distributions [14], and characterizations of Pearson-type distributions are also available in the literature [4]. Irving W. Burr proposed another important development in this category. In Burr’s method [9], the system of distributions satisfies the following differential equation:

$$dF=F(1-F)\,g(x)\,dx, \tag{1.2}$$

where 0 ≤ F ≤ 1 and g(x) is a non-negative function of x. Twelve different solutions of Eqn. (1.2), in the form of cumulative distribution functions (CDFs), were given and named the Burr Type I–XII distributions. The Burr Type XII distribution, a member of the Burr system, has gained more attention in the last decade due to its potential use in practical situations. The Burr Type XII, simply called the Burr distribution, is highly useful for fitting heavy-tailed data sets from reliability, economics, hydrology, actuarial science, and network science, among many other fields [24]. The Burr distribution also emerges as a suitable model to describe stationary states of complex and non-equilibrium systems [38, 39]. From the point of view of extreme value statistics, the main advantage of the Burr distribution is that it has algebraic tails, which are effective for modeling failures that occur with lesser frequency than under corresponding models based on exponential tails [41]. The CDF and PDF of the Burr distribution are defined as follows.

Definition 1.

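A random variable X follows the Burr distribution with parameters c, λ, α if its CDF is of the following form:

$$F(x;c,\lambda,\alpha)=1-\left[1+\left(\frac{x}{\lambda}\right)^{c}\right]^{-\alpha},\quad x>0,\ c,\alpha,\lambda>0, \tag{1.3}$$

where α and c are shape parameters and λ is a scale parameter. The density function of the Burr distribution is given by

$$f(x;c,\lambda,\alpha)=\frac{c\alpha}{\lambda^{c}}\,x^{c-1}\left[1+\left(\frac{x}{\lambda}\right)^{c}\right]^{-\alpha-1},\quad x>0,\ c,\alpha,\lambda>0. \tag{1.4}$$

The corresponding survival function is given by

$$S(x;c,\lambda,\alpha)=\left[1+\left(\frac{x}{\lambda}\right)^{c}\right]^{-\alpha},\quad x>0. \tag{1.5}$$

The hazard rate function is given by

$$h(x;c,\lambda,\alpha)=\frac{f(x;c,\lambda,\alpha)}{S(x;c,\lambda,\alpha)}=\frac{c\alpha}{\lambda^{c}}\,x^{c-1}\left[1+\left(\frac{x}{\lambda}\right)^{c}\right]^{-1},\quad x>0. \tag{1.6}$$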
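As a numerical companion to (1.3)–(1.6), the following Python sketch evaluates the four Burr functions directly from their closed forms. The function names and the parameter values in the demo are illustrative assumptions, not part of the original article.

```python
import math


def burr_sf(x, c, lam, alpha):
    """Survival function (1.5): S(x) = [1 + (x/lam)^c]^(-alpha) for x > 0."""
    if x <= 0:
        return 1.0
    return (1.0 + (x / lam) ** c) ** (-alpha)


def burr_cdf(x, c, lam, alpha):
    """CDF (1.3): F(x) = 1 - S(x)."""
    return 1.0 - burr_sf(x, c, lam, alpha)


def burr_pdf(x, c, lam, alpha):
    """Density (1.4)."""
    if x <= 0:
        return 0.0
    return (c * alpha / lam ** c) * x ** (c - 1) * (1.0 + (x / lam) ** c) ** (-alpha - 1)


def burr_hazard(x, c, lam, alpha):
    """Hazard rate (1.6): h(x) = f(x) / S(x)."""
    if x <= 0:
        return 0.0
    return (c * alpha / lam ** c) * x ** (c - 1) / (1.0 + (x / lam) ** c)


if __name__ == "__main__":
    c, lam, alpha = 2.0, 3.0, 1.5  # illustrative values only
    for x in (0.5, 2.0, 10.0):
        print(x, burr_cdf(x, c, lam, alpha), burr_pdf(x, c, lam, alpha),
              burr_hazard(x, c, lam, alpha))
```

A simple sanity check is that `burr_hazard` equals `burr_pdf / burr_sf` and that a finite-difference derivative of `burr_cdf` reproduces `burr_pdf`.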

It is interesting to note that the survival function corresponding to the CDF in (1.3) is regularly varying at infinity, viz. it satisfies, for some γ > 0, called the tail index, and for all t > 0,

$$\lim_{x\to\infty}\frac{1-F(tx)}{1-F(x)}=t^{-\gamma}.$$

This result suggests that the Burr distribution, as a heavy-tailed distribution, is well suited for modeling extreme values. Bivariate and multivariate extensions of the Burr distribution are available in the literature [13, 42]. Numerous modifications and generalizations of the Burr distribution, using different compounding and weighting techniques, have also been suggested. See, for instance, the beta Burr distribution introduced by [36], the McDonald Burr distribution introduced by [17], the order-statistics based generalized Burr distribution [5], the Marshall-Olkin Burr (MO Burr) distribution developed by [22], the modified Burr distribution [21], the exponentiated Burr distribution introduced by [26], and the log-Weibull Burr distribution developed by [1]. Although these modifications and generalizations performed well in specific data analyses, all of them assume a constant exponent for the Burr distribution, which can cause failures when working with large-scale heavy-tailed data sets from various applied domains [24]. In this work, the primary hypothesis is that the exponent α of the Burr distribution is not a constant and varies according to a nonlinear function g which depends on the data.

This article aims to introduce an extra parameter to the Burr, Pareto, and exponentiated distributions to bring more flexibility to these families when dealing with real-world large-scale heavy-tailed data sets. We introduce a new method of generalization, namely the shape-parameter transformation (SPT) method, in which the non-negative shape parameter is assumed to be expressible as a nonlinear function of the empirical data, adding one parameter to the base distribution. The proposed SPT method is straightforward to use and can therefore be applied effectively in data analysis. The SPT method is first applied to the Burr distribution to yield the generalized Burr (GBurr) family of distributions. The newly introduced GBurr family has attractive statistical properties and belongs to the maximum domain of attraction of the Fréchet distribution. The method presented in this paper adds one parameter to the distribution and can be a competitor to the popularly used Marshall-Olkin (MO) [30] and Lehmann alternatives-based [27] methods of generalizing continuous distributions. The SPT method is further applied to the Pareto and the exponentiated families of distributions, which shows its broader applicability. Further, a specific nonlinear variant of the GBurr family, which we call the NBurr distribution, is discussed in detail. Complementary theoretical aspects are studied, such as shapes, asymptotes, quantiles, stochastic ordering, the reliability parameter, and inferential statistics. The application of the proposed NBurr distribution is shown using large-scale heavy-tailed network data sets from various disciplines.

The rest of the paper is organized as follows. We discuss the SPT method and its application to the Burr, Pareto, and exponentiated family of distributions in Section 2 along with some structural properties. In Section 3, we study a simple model from the GBurr family of distributions and discuss its stochastic and inferential characteristics. The empirical application of this new NBurr distribution to various large scale heavy-tailed network data sets is presented in Section 4. Finally, we conclude this paper in Section 5.

2. Shape parameter transformation (SPT) method

In this section, we introduce the SPT method to generalize the Burr, Pareto, and exponentiated distributions by modeling the shape parameter of these families as a nonlinear function of the data. The method also adds one parameter to each of these families, allowing flexible modeling of real-life data sets. The SPT method is first applied to yield generalized Burr distributions, which are heavy-tailed and regularly varying at infinity. The method is then applied to the Pareto distribution and the exponentiated family of distributions to generalize these families.

2.1. The generalized Burr (GBurr) family

Definition 2.

A continuous random variable X follows the generalized Burr (GBurr) family of distributions if and only if it has the following CDF:

$$F_{GBurr}(x;c,\lambda,\alpha,\beta)=1-\left[1+\left(\frac{x}{\lambda}\right)^{c}\right]^{-g\left(\frac{x}{\lambda};\,\alpha,\beta\right)},\quad x\ge 0,\ \alpha,\lambda,c>0, \tag{2.1}$$

and F(x) = 0 if x < 0, where the real-valued, continuous, positive function g: (0, ∞) → ℝ+ is differentiable on (0, ∞). The shape parameter α of the Burr distribution is replaced with g(z) = g(x/λ; α, β) = g(x/λ), say, where β is an additional shape parameter and g(z) satisfies the following conditions:

  1. The function g(z) is strictly positive and has a finite limit at infinity, viz.

    $$\lim_{z\to\infty}g(z)=\alpha\ (>0). \tag{2.2}$$
  2. $\lim_{z\to 0^{+}}\left[1+z^{c}\right]^{g(z)}=1$ and $\lim_{z\to\infty}\left[1+z^{c}\right]^{g(z)}=\infty$.

  3. $\dfrac{g'(z)}{g(z)}\ge-\dfrac{c\,z^{c-1}}{\left(1+z^{c}\right)\log\left(1+z^{c}\right)}$, z > 0, where $g'(z)=\frac{d}{dz}[g(z)]$.

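It is easy to verify that:

  1. $F_{GBurr}(x)$ is non-decreasing, since

    $$c\left(\frac{x}{\lambda}\right)^{c-1}g\left(\frac{x}{\lambda}\right)+g'\left(\frac{x}{\lambda}\right)\left[1+\left(\frac{x}{\lambda}\right)^{c}\right]\log\left[1+\left(\frac{x}{\lambda}\right)^{c}\right]>0,\quad x>0.$$
  2. $\lim_{x\to-\infty}F_{GBurr}(x)=F_{GBurr}(-\infty)=0$, $\lim_{x\to\infty}F_{GBurr}(x)=1$.

  3. $F_{GBurr}(x)$ is right continuous.

Thus, (2.1) is a standard CDF and it can also be expressed as follows:

$$F_{GBurr}(x)=1-\exp\left\{-g\left(\frac{x}{\lambda}\right)\log\left[1+\left(\frac{x}{\lambda}\right)^{c}\right]\right\},\quad x>0. \tag{2.3}$$

The corresponding survival function is given by

$$S_{GBurr}(x)=\begin{cases}\left[1+\left(\frac{x}{\lambda}\right)^{c}\right]^{-g(x/\lambda)}, & x>0,\\ 1, & x\le 0.\end{cases} \tag{2.4}$$

The probability density function is

$$f_{GBurr}(x)=\begin{cases}\dfrac{1}{\lambda}\left[\dfrac{c\left(\frac{x}{\lambda}\right)^{c-1}}{1+\left(\frac{x}{\lambda}\right)^{c}}\,g\left(\dfrac{x}{\lambda}\right)+g'\left(\dfrac{x}{\lambda}\right)\log\left(1+\left(\dfrac{x}{\lambda}\right)^{c}\right)\right]\left[1+\left(\dfrac{x}{\lambda}\right)^{c}\right]^{-g(x/\lambda)}, & x>0,\\ 0, & x\le 0.\end{cases} \tag{2.5}$$

The hazard rate function is given by

$$h_{GBurr}(x)=\begin{cases}\dfrac{c}{\lambda}\,\dfrac{\left(\frac{x}{\lambda}\right)^{c-1}}{1+\left(\frac{x}{\lambda}\right)^{c}}\,g\left(\dfrac{x}{\lambda}\right)+\dfrac{1}{\lambda}\,g'\left(\dfrac{x}{\lambda}\right)\log\left(1+\left(\dfrac{x}{\lambda}\right)^{c}\right), & x>0,\\ 0, & x\le 0.\end{cases} \tag{2.6}$$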
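The expressions (2.4)–(2.6) depend on the exponent function only through g and its derivative g′, so they can be coded once and reused for any admissible g. The sketch below is a minimal illustration; the function names, the callables `g` and `dg`, and the parameter values are assumptions made for this example.

```python
import math


def gburr_sf(x, c, lam, g):
    """Survival function (2.4); g is the exponent function z -> g(z)."""
    if x <= 0:
        return 1.0
    z = x / lam
    return math.exp(-g(z) * math.log1p(z ** c))


def gburr_hazard(x, c, lam, g, dg):
    """Hazard rate (2.6); dg is the derivative g'(z)."""
    if x <= 0:
        return 0.0
    z = x / lam
    burr_part = (c / lam) * z ** (c - 1) / (1.0 + z ** c) * g(z)
    extra_part = (1.0 / lam) * dg(z) * math.log1p(z ** c)
    return burr_part + extra_part


def gburr_pdf(x, c, lam, g, dg):
    """Density (2.5), written as hazard times survival."""
    return gburr_hazard(x, c, lam, g, dg) * gburr_sf(x, c, lam, g)


# Example exponent: g(z) = alpha - beta / (1 + z), with g'(z) = beta / (1 + z)^2.
alpha, beta = 2.0, 0.5  # illustrative values only
g = lambda z: alpha - beta / (1.0 + z)
dg = lambda z: beta / (1.0 + z) ** 2

if __name__ == "__main__":
    for x in (0.5, 1.7, 8.0):
        print(x, gburr_sf(x, 1.5, 2.0, g), gburr_pdf(x, 1.5, 2.0, g, dg))
```

Passing a constant function for `g` and a zero function for `dg` recovers the Burr survival function (1.5), which gives a quick consistency check.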

The CDF (2.1) of the GBurr family of distributions has a regularly varying tail and belongs to the maximum domain of attraction (MDA) of the Fréchet distribution with index α > 0, viz. $F\in MDA(\Phi_\alpha)$, with $\Phi_\alpha(x)=\exp\{-x^{-\alpha}\}$, x > 0, α > 0 [15].

Theorem 2.1.

The GBurr family of distributions is:

  1. heavy-tailed,

  2. right tail-equivalent to a Pareto distribution, and

  3. a member of the MDA of the Fréchet distribution.

Proof.

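  1. For the GBurr family of distributions, we note that

    $$\lim_{x\to\infty}\exp\{kx\}\,(1-F(x))=\lim_{x\to\infty}\exp\left\{kx-g\left(\frac{x}{\lambda}\right)\log\left(1+\frac{x^{c}}{\lambda^{c}}\right)\right\}=\infty,$$

    where k, λ, c > 0.

  2. To show the tail-equivalence property, we show

    $$\lim_{x\to\infty}\frac{1-F(x)}{1-G(x)}=\lim_{x\to\infty}\frac{\exp\left\{-g\left(\frac{x}{\lambda}\right)\log\left(1+\frac{x^{c}}{\lambda^{c}}\right)\right\}}{\exp\left\{-\alpha\log\left(1+\frac{x}{\lambda}\right)\right\}}=1,\quad\text{for all }\lambda,c>0,$$

    where G(x) is the CDF of the Pareto Type-II distribution.

  3. It should be noted that any function g as defined in Definition 2, satisfying $\lim_{z\to\infty}g(z)=\alpha>0$, is slowly varying at infinity:

    $$\lim_{z\to\infty}\frac{g(tz)}{g(z)}=1,\quad\text{for all }t>0.$$

Now,

$$\lim_{x\to\infty}\frac{1-F(tx)}{1-F(x)}=\lim_{x\to\infty}\frac{\left[1+\frac{(tx)^{c}}{\lambda^{c}}\right]^{-g\left(\frac{tx}{\lambda}\right)}}{\left[1+\frac{x^{c}}{\lambda^{c}}\right]^{-g\left(\frac{x}{\lambda}\right)}}=t^{-\alpha},\quad\text{for all }t>0\text{ and }\lambda,c>0.$$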

It can be seen that the Burr distribution belongs to this GBurr family and corresponds to the simplest choice g(z) = α. We give some examples of nonlinear g(z) satisfying the conditions given in Definition 2 and present some distributions belonging to the GBurr family in Table 1. In order to select a model, one can choose the nonlinear exponent g(z) that meets the empirical characteristics of the given data sets. This list of GBurr models can be expanded further by introducing other forms of g(z) satisfying $\lim_{z\to\infty}g(z)=\alpha>0$ or by increasing the number of parameters in the function g.

Table 1.

Some examples of GBurr family of distributions.

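| g(z) for all z > 0 | β | f(x) for all x > 0 |
|---|---|---|
| $\alpha-\dfrac{\beta}{z+1}$ | $\beta\le\alpha$ | $\left[\dfrac{\beta\lambda}{(x+\lambda)^{2}}\log\left(1+\left(\dfrac{x}{\lambda}\right)^{c}\right)+\left(\alpha-\dfrac{\beta\lambda}{x+\lambda}\right)\dfrac{c\,x^{c-1}}{x^{c}+\lambda^{c}}\right]\left[1+\left(\dfrac{x}{\lambda}\right)^{c}\right]^{-\left(\alpha-\frac{\beta}{1+x/\lambda}\right)}$ |
| $\alpha\left(\dfrac{z}{1+z}\right)^{\beta}$ | β > −1 | $\dfrac{\alpha}{\lambda}\left(\dfrac{x}{x+\lambda}\right)^{\beta}\left[\dfrac{\beta\lambda^{2}}{x(x+\lambda)}\log\left(1+\dfrac{x^{c}}{\lambda^{c}}\right)+\dfrac{\lambda c\,x^{c-1}}{x^{c}+\lambda^{c}}\right]\exp\left\{-\alpha\log\left(1+\dfrac{x^{c}}{\lambda^{c}}\right)\left(\dfrac{x}{x+\lambda}\right)^{\beta}\right\}$ |
| $\alpha\left(\dfrac{\log(1+z)}{1+\log(1+z)}\right)^{\beta}$ | β > −1 | $\dfrac{\alpha}{\lambda}\left(\dfrac{\log(1+x/\lambda)}{1+\log(1+x/\lambda)}\right)^{\beta}\left[\dfrac{\beta\log\left(1+\frac{x^{c}}{\lambda^{c}}\right)}{(1+x/\lambda)\,[1+\log(1+x/\lambda)]\,[\log(1+x/\lambda)]}+\dfrac{\lambda c\,x^{c-1}}{x^{c}+\lambda^{c}}\right]\times\exp\left\{-\dfrac{\alpha\log^{\beta}(1+x/\lambda)\,\log\left(1+\frac{x^{c}}{\lambda^{c}}\right)}{[1+\log(1+x/\lambda)]^{\beta}}\right\}$ |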

We can also obtain some popular size distributions as the particular cases of the GBurr family:

  • By taking a constant g function, viz. g(z) = 1, the GBurr distribution reduces to the Fisk distribution [16].

  • For α → ∞ and β = 0, the GBurr family of distributions converges to the Weibull distribution [25].

  • By taking c = 1 and β = 0, the GBurr distribution reduces to the Lomax distribution [29].
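Two of these reductions can be checked by direct substitution into (2.1). The short derivation below is a sketch added for illustration; it assumes that setting β = 0 in the Table 1 examples leaves the constant exponent g(z) = α.

$$g(z)\equiv 1:\qquad F(x)=1-\left[1+\left(\frac{x}{\lambda}\right)^{c}\right]^{-1}=\frac{(x/\lambda)^{c}}{1+(x/\lambda)^{c}},\quad x>0,$$

which is the Fisk (log-logistic) CDF with scale λ and shape c, and

$$c=1,\ g(z)\equiv\alpha:\qquad F(x)=1-\left(1+\frac{x}{\lambda}\right)^{-\alpha},\quad x>0,$$

which is the Lomax (Pareto Type-II) CDF with scale λ and shape α.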

2.2. Generalized classical Pareto distribution

In this section, we apply the SPT method to the classical Pareto distribution to yield generalized classical Pareto (GCP) distributions as defined below.

Definition 3.

A continuous random variable X follows the GCP family of distributions if and only if it has the following CDF:

$$F_{GCP}(x)=1-\left(\frac{x}{\gamma}\right)^{-m\left(\frac{x}{\gamma}\right)},\quad x\ge\gamma, \tag{2.7}$$

and F(x) = 0 if x < γ, where γ (> 0) is a scale parameter, m: (1, ∞) → ℝ+ is a real, continuous, positive function which is differentiable on (1, ∞), and the function m satisfies the following conditions:

  1. m is strictly positive and has a finite limit at infinity, viz.

    $$\lim_{z\to\infty}m(z)=\alpha\ (>0). \tag{2.8}$$
  2. $\lim_{z\to 1^{+}}z^{m(z)}=1$ and $\lim_{z\to\infty}z^{m(z)}=\infty$.

  3. $\dfrac{m'(z)}{m(z)}>-\dfrac{1}{z\log(z)}$, z > 1.

It is noted that condition (3) is equivalent to

$$\frac{d}{dz}\log(m(z))>-\frac{d}{dz}\log[\log(z)],\quad z>1.$$

It is easy to verify that F(x) in (2.7) is a standard distribution function; the distribution of any continuous random variable X satisfying the above-mentioned conditions is called a GCP distribution. An alternative expression for F(x) can also be given as follows:

$$F_{GCP}(x)=1-\exp\left\{-m\left(\frac{x}{\gamma}\right)\log\left(\frac{x}{\gamma}\right)\right\},\quad x\ge\gamma.$$

The survival function is given by

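$$S_{GCP}(x)=\begin{cases}\left(\dfrac{x}{\gamma}\right)^{-m\left(\frac{x}{\gamma}\right)} & \text{if }x>\gamma,\\ 1 & \text{if }x\le\gamma.\end{cases} \tag{2.9}$$

The PDF for the family of GCP distributions is given by

$$f_{GCP}(x)=\begin{cases}\left[m\left(\dfrac{x}{\gamma}\right)+m'\left(\dfrac{x}{\gamma}\right)\dfrac{x}{\gamma}\log\left(\dfrac{x}{\gamma}\right)\right]\dfrac{S_{GCP}(x)}{x} & \text{if }x>\gamma,\\ 0 & \text{if }x\le\gamma.\end{cases} \tag{2.10}$$

The hazard function is given by

$$h_{GCP}(x)=\begin{cases}\dfrac{m\left(\frac{x}{\gamma}\right)+m'\left(\frac{x}{\gamma}\right)\frac{x}{\gamma}\log\left(\frac{x}{\gamma}\right)}{x} & \text{if }x>\gamma,\\ 0 & \text{if }x\le\gamma.\end{cases} \tag{2.11}$$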
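The GCP quantities (2.9)–(2.11) can be evaluated for any exponent function m with a few lines of Python. The sketch below is illustrative only: the function names, the callables `m` and `dm` (for m and m′), and the parameter values are assumptions introduced here.

```python
import math


def gcp_sf(x, gamma, m):
    """Survival function (2.9): (x/gamma)^(-m(x/gamma)) for x > gamma."""
    if x <= gamma:
        return 1.0
    z = x / gamma
    return math.exp(-m(z) * math.log(z))


def gcp_hazard(x, gamma, m, dm):
    """Hazard rate (2.11); dm is the derivative m'(z)."""
    if x <= gamma:
        return 0.0
    z = x / gamma
    return (m(z) + dm(z) * z * math.log(z)) / x


def gcp_pdf(x, gamma, m, dm):
    """Density (2.10), written as hazard times survival."""
    return gcp_hazard(x, gamma, m, dm) * gcp_sf(x, gamma, m)


# Example exponent m(z) = alpha - beta / z with m'(z) = beta / z^2.
alpha, beta = 2.5, 1.0  # illustrative values only
m = lambda z: alpha - beta / z
dm = lambda z: beta / z ** 2

if __name__ == "__main__":
    for x in (2.0, 4.0, 20.0):
        print(x, gcp_sf(x, 1.5, m), gcp_pdf(x, 1.5, m, dm))
```

With a constant exponent m(z) = α the density collapses to the classical Pareto form αγ^α / x^(α+1), which is a convenient check of the implementation.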

Table 2 shows some examples of GCP distributions satisfying condition (2.8) (i.e., $\lim_{z\to\infty}m(z)=\alpha>0$). It is noted that the simplest choice m(z) = α leads to the classical Pareto distribution. The rest of the models are completely new. In order to select a model, one can choose the nonlinear exponent m(z) that meets the empirical characteristics of the given data sets. This list of GCP models can be expanded further by introducing other forms of m(z) satisfying $\lim_{z\to\infty}m(z)=\alpha>0$ or by increasing the number of parameters in the function m.

Table 2.

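Some examples of the GCP family of distributions with $\lim_{z\to\infty}m(z)=\alpha>0$.

| m(z) for all z > 1 | β | f(x) for all x > γ |
|---|---|---|
| $\alpha-\dfrac{\beta}{z}$ | $\beta\le\alpha$ | $\left(\dfrac{x}{\gamma}\right)^{-\alpha+\beta\gamma/x}\left[\alpha+\dfrac{\gamma\beta}{x}\left(\log(x/\gamma)-1\right)\right]\dfrac{1}{x}$ |
| $\alpha\left(\dfrac{z-1}{z}\right)^{\beta}$ | β > −1 | $\dfrac{\alpha}{x}\left(\dfrac{x}{\gamma}\right)^{-\alpha\left(\frac{(x/\gamma)-1}{(x/\gamma)}\right)^{\beta}}\left[\dfrac{(x/\gamma)-1+\beta\log(x/\gamma)}{(x/\gamma)-1}\right]\left(\dfrac{(x/\gamma)-1}{(x/\gamma)}\right)^{\beta}$ |
| $\alpha\left(\dfrac{\log(z)}{1+\log(z)}\right)^{\beta}$ | β > −1 | $\dfrac{\alpha}{x}\left(\dfrac{x}{\gamma}\right)^{-\alpha\left(\frac{\log(x/\gamma)}{\log(x/\gamma)+1}\right)^{\beta}}\left[\dfrac{\log(x/\gamma)+1+\beta}{\log(x/\gamma)+1}\right]\left(\dfrac{\log(x/\gamma)}{\log(x/\gamma)+1}\right)^{\beta}$ |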

2.3. Generalized exponentiated distributions

The idea of the exponentiated family of distributions is based on Lehmann alternatives [27], which add a non-negative shape parameter to any continuous probability distribution:

$$F^{*}(x)=[F(x)]^{\alpha},\quad\text{where }\alpha>0\text{ (real)}, \tag{2.12}$$

where F(x) is a standard CDF. This new family F* is called the ‘exponentiated family’, in which one raises the CDF of an existing distribution to the power of an additional parameter. Lehmann motivated it in the following way: when α is an integer, $[F(x)]^{\alpha}$ is the distribution of the maximum of α independent and identically distributed variables with distribution F, and when α is a rational number, $[F(x)]^{\alpha}$ is a one-parameter family of a nonparametric class of alternatives. In the theory of generalized probability distributions, Lehmann alternatives have been widely used, for example to generate the so-called exponentiated Weibull family of distributions [31]. A systematic treatment of the exponentiated Weibull, exponentiated gamma, and exponentiated Pareto distributions is available in [18]. Gupta and Kundu studied the exponentiated exponential distribution [20], and recently Gupta et al. [19] proposed Power Normal distributions using the same idea of Lehmann alternatives for the standard normal distribution. Nadarajah and Kotz [32] studied a list of exponentiated X families of distributions, including the exponentiated Gumbel, Fréchet, and gamma distributions. We use our SPT method to generalize any ‘exponentiated family’ of distributions.

Definition 4.

A continuous random variable X follows the generalized exponentiated distribution based on a distribution function F(x) if and only if it has the following CDF

$$G(x)=[F(x)]^{m(x)},\quad-\infty<x<\infty,$$

where m: (−∞, ∞) → (0, ∞) is a real, continuous function which is differentiable on (0, ∞). The function m satisfies the following conditions:

  1. m is strictly positive and has a finite limit at infinity, viz.

    $$\lim_{x\to\infty}m(x)=\alpha\ (>0).$$
  2. $\lim_{x\to-\infty}m(x)=\beta\ (>\alpha)$.

  3. m′(x) < 0.

It is important to note that when m(x) = α, G(x) corresponds to the exponentiated family of distributions [18, 32]. We give some examples of m(x) to be used for distributions with X ∈ ℝ.

Example 1.

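Let m(x) be a function of the following form, with ξ satisfying $\lim_{x\to-\infty}\xi(x)=2$ and $\lim_{x\to\infty}\xi(x)=0$:

$$m(x)=\frac{(\beta-\alpha)}{2}\,\xi(x)+\alpha,\quad\beta>\alpha>0.$$

Some choices of ξ(x) are as follows:

(a)   $\xi(x)=\begin{cases}e^{-x} & \text{if }x\ge 0,\\ 2-e^{x} & \text{if }x<0,\end{cases}$
(b)    $\xi(x)=\begin{cases}\dfrac{1}{x^{2k+1}+1} & \text{if }x\ge 0,\\ 2+\dfrac{1}{x^{2k+1}-1} & \text{if }x<0,\end{cases}$
(c)    $\xi(x)=\begin{cases}\dfrac{1}{\log(1+x)+1} & \text{if }x\ge 0,\\ 2-\dfrac{1}{\log(1-x)+1} & \text{if }x<0,\end{cases}$

where k is any natural number.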

Example 2.

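Let m(x) be a function of the following form, with σ satisfying $\lim_{x\to-\infty}\sigma(x)=0$ and $\lim_{x\to\infty}\sigma(x)=1$:

$$m(x)=\alpha\,\sigma(x)+\beta\,(1-\sigma(x)),\quad\beta>\alpha>0.$$

Some choices of σ(x) are as follows:

  • $\sigma(x)=\dfrac{1}{1+e^{-x}}$, x ∈ ℝ.

  • $\sigma(x)=\dfrac{1}{2}+\dfrac{1}{\pi}\arctan(x)$, x ∈ ℝ.

  • $\sigma(x)=\dfrac{1}{\sqrt{2\pi}}\displaystyle\int_{-\infty}^{x}\exp\left\{-y^{2}/2\right\}dy$, x ∈ ℝ.

It is interesting to see that

$$G(x)=[F(x)]^{\alpha\Phi(x)+\beta\Phi(-x)},\quad x\in\mathbb{R},$$

is a standard CDF, where Φ(x) is the standard normal CDF. For all the above examples, when α = β we get $G(x)=[F(x)]^{\alpha}$.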
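The last construction is easy to explore numerically. The sketch below evaluates G(x) = [F(x)]^(αΦ(x)+βΦ(−x)) and checks on a grid that it is non-decreasing. The choice of a standard logistic base CDF F, the function names, and the parameter values are assumptions made only for this illustration.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def exponent(x, alpha, beta):
    """m(x) = alpha * Phi(x) + beta * Phi(-x); tends to beta at -inf and alpha at +inf."""
    return alpha * PHI(x) + beta * PHI(-x)


def logistic_cdf(x):
    """Base CDF F(x); the logistic choice is an assumption for this example."""
    return 1.0 / (1.0 + math.exp(-x))


def gen_exp_cdf(x, alpha, beta, base_cdf=logistic_cdf):
    """Generalized exponentiated CDF G(x) = F(x)^m(x)."""
    return base_cdf(x) ** exponent(x, alpha, beta)


if __name__ == "__main__":
    alpha, beta = 1.0, 3.0  # illustrative values with beta > alpha > 0
    grid = [i / 10.0 for i in range(-80, 81)]
    values = [gen_exp_cdf(x, alpha, beta) for x in grid]
    print("monotone on grid:", all(b >= a for a, b in zip(values, values[1:])))
```

Setting α = β makes the exponent constant, so the function reduces to the ordinary exponentiated CDF F(x)^α.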

3. NBurr distribution: Definition and properties

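In this section, we study a relevant model of the GBurr family of distributions with the choice $g(z)=\alpha-\frac{\beta}{1+z}$ from Table 1. The CDF is given by

$$F_{NBurr}(x)=1-\left[1+\left(\frac{x}{\lambda}\right)^{c}\right]^{-\left(\alpha-\frac{\beta}{1+x/\lambda}\right)},\quad x>0, \tag{3.1}$$

where α, λ, c > 0, β > −1 and α > β. We call this simple generalized form of the Burr distribution the NBurr distribution (Burr distribution with nonlinear exponent). The NBurr distribution reduces to the Burr distribution when β = 0. The CDF in (3.1) can alternatively be written in the following form:

$$F_{NBurr}(x)=1-\exp\left\{-\left(\alpha-\frac{\beta}{1+\frac{x}{\lambda}}\right)\log\left[1+\left(\frac{x}{\lambda}\right)^{c}\right]\right\},\quad x>0. \tag{3.2}$$

The corresponding survival function is given by

$$S_{NBurr}(x)=\left[1+\left(\frac{x}{\lambda}\right)^{c}\right]^{-\left(\alpha-\frac{\beta}{1+x/\lambda}\right)},\quad x>0. \tag{3.3}$$

The probability density function for x > 0 is given by

$$f_{NBurr}(x)=\left[\frac{\beta\lambda}{(x+\lambda)^{2}}\log\left(1+\left(\frac{x}{\lambda}\right)^{c}\right)+\left(\alpha-\frac{\beta\lambda}{x+\lambda}\right)\frac{c\,x^{c-1}}{x^{c}+\lambda^{c}}\right]S_{NBurr}(x),\quad x>0, \tag{3.4}$$

and the hazard rate function is given by

$$h_{NBurr}(x)=\frac{\beta\lambda}{(x+\lambda)^{2}}\log\left(1+\left(\frac{x}{\lambda}\right)^{c}\right)+\left(\alpha-\frac{\beta\lambda}{x+\lambda}\right)\frac{c\,x^{c-1}}{x^{c}+\lambda^{c}},\quad x>0. \tag{3.5}$$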
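The NBurr CDF (3.1) has no closed-form inverse, so quantiles and random draws have to be obtained numerically. The sketch below implements (3.1), (3.4) and (3.5) and inverts the CDF by bisection. It is a minimal illustration under stated assumptions: the function names and parameter values are hypothetical, and the bisection assumes parameter values for which the CDF is non-decreasing (for example α > β ≥ 0).

```python
import math
import random


def nburr_cdf(x, alpha, beta, lam, c):
    """CDF (3.1)/(3.2)."""
    if x <= 0:
        return 0.0
    z = x / lam
    return 1.0 - math.exp(-(alpha - beta / (1.0 + z)) * math.log1p(z ** c))


def nburr_hazard(x, alpha, beta, lam, c):
    """Hazard rate (3.5)."""
    first = beta * lam / (x + lam) ** 2 * math.log1p((x / lam) ** c)
    second = (alpha - beta * lam / (x + lam)) * c * x ** (c - 1) / (x ** c + lam ** c)
    return first + second


def nburr_pdf(x, alpha, beta, lam, c):
    """Density (3.4) = hazard * survival."""
    return nburr_hazard(x, alpha, beta, lam, c) * (1.0 - nburr_cdf(x, alpha, beta, lam, c))


def nburr_quantile(p, alpha, beta, lam, c):
    """Solve F(x) = p for 0 < p < 1 by bisection (assumes F is non-decreasing)."""
    lo, hi = 0.0, lam
    while nburr_cdf(hi, alpha, beta, lam, c) < p:  # grow the bracket
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if nburr_cdf(mid, alpha, beta, lam, c) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def nburr_sample(n, alpha, beta, lam, c, rng=random):
    """Inverse-transform sampling: X = F^{-1}(U) with U uniform on (0, 1)."""
    return [nburr_quantile(rng.random(), alpha, beta, lam, c) for _ in range(n)]


if __name__ == "__main__":
    params = (2.0, 0.5, 1.5, 1.2)  # (alpha, beta, lam, c), illustrative only
    print("median:", nburr_quantile(0.5, *params))
    print("five draws:", nburr_sample(5, *params, rng=random.Random(0)))
```

Setting `beta = 0` in `nburr_cdf` reproduces the Burr CDF (1.3), in line with the remark that the NBurr distribution contains the Burr distribution.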

The proposed NBurr distribution satisfies the extreme value properties given in Theorem 2.1. Some graphs of the NBurr model derived from this newly introduced GBurr family of distributions are illustrated in Figure 1. Note that the NBurr distribution with parameters α, β, λ, c, as a distribution in the MDA of the Fréchet distribution, satisfies the von Mises condition:

$$\lim_{x\to\infty}\frac{x\,f_{NBurr}(x)}{S_{NBurr}(x)}=\alpha>0.$$
FIGURE 1. Plots of PDFs of the NBurr distribution.

In the following subsections, we study the reliability and inferential properties of the new NBurr distribution.

3.1 Reliability properties of NBurr distribution

In this section, we study some reliability properties of the newly introduced NBurr distribution including monotonicity of hazard rates, stochastic orderings, entropy, etc. The following theorem shows that the hazard rate function of the NBurr distribution is increasing and decreasing under certain conditions on c and β.

Theorem 3.1.

The hazard rate function of the NBurr distribution satisfies the following properties:

  1. If 0 ≤ c ≤ 1 and β > 0, then for all x > 0, $h_{NBurr}(x)$ is decreasing in x;

  2. If c > 1 and β > 0, then for all x > 0,

    1. $h_{NBurr}(x)$ is increasing in x, when $c-1>\left(\frac{x}{\lambda}\right)^{c}$;

    2. $h_{NBurr}(x)$ is decreasing in x, when $c-1<\left(\frac{x}{\lambda}\right)^{c}$;

    3. $h_{NBurr}(x)$ is maximum at $x=\lambda\,[c-1]^{1/c}$.

Proof.

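Differentiating (3.5) with respect to x, we have

$$\begin{aligned}h'_{NBurr}(x)&=-\frac{2\beta\lambda}{(x+\lambda)^{3}}\log\left(1+\frac{x^{c}}{\lambda^{c}}\right)+\frac{2c\beta\lambda\,x^{c-1}}{(x+\lambda)^{2}(x^{c}+\lambda^{c})}+\left(\alpha-\frac{\beta\lambda}{x+\lambda}\right)\frac{c\,x^{c-2}\lambda^{c}}{(x^{c}+\lambda^{c})^{2}}\left[c-1-\frac{x^{c}}{\lambda^{c}}\right]\\&=\frac{2\beta\lambda}{(x+\lambda)^{3}}\left[\frac{c\,x^{c-1}(x+\lambda)}{x^{c}+\lambda^{c}}-\log\left(1+\frac{x^{c}}{\lambda^{c}}\right)\right]+\left(\alpha-\frac{\beta\lambda}{x+\lambda}\right)\frac{c\,x^{c-2}\lambda^{c}}{(x^{c}+\lambda^{c})^{2}}\left[c-1-\frac{x^{c}}{\lambda^{c}}\right].\end{aligned}$$

Now, $\alpha-\frac{\beta\lambda}{x+\lambda}\ge 0$, $\frac{c\lambda^{c}x^{c-2}}{(x^{c}+\lambda^{c})^{2}}\ge 0$ and, if β > 0, $\frac{2\beta\lambda}{(x+\lambda)^{3}}>0$.

Thus, $h'_{NBurr}(x)\ge(\le)\,0$ if

  1. $A(x)=\dfrac{c\,x^{c-1}(x+\lambda)}{x^{c}+\lambda^{c}}-\log\left(1+\dfrac{x^{c}}{\lambda^{c}}\right)\ge(\le)\,0$;

  2. $c-1-\dfrac{x^{c}}{\lambda^{c}}\ge(\le)\,0$.

Now,

$$A'(x)=\frac{c\lambda^{c}x^{c-2}(x+\lambda)}{(x^{c}+\lambda^{c})^{2}}\left[c-1-\frac{x^{c}}{\lambda^{c}}\right].$$

We can see that A′(x) ≤ 0 if 0 ≤ c ≤ 1. Therefore, A(x) is decreasing in x, and again A(0) = 0. Thus, we have A(x) ≤ 0.

Similarly, we can prove that if c > 1, then A(x) ≥ 0 when $c-1>\left(\frac{x}{\lambda}\right)^{c}$ and A(x) ≤ 0 when $c-1<\left(\frac{x}{\lambda}\right)^{c}$. This implies that for 0 ≤ c ≤ 1, $h_{NBurr}(x)$ is decreasing in x; for c > 1, $h_{NBurr}(x)$ is increasing in x when $c-1>\left(\frac{x}{\lambda}\right)^{c}$ and decreasing in x when $c-1<\left(\frac{x}{\lambda}\right)^{c}$. Thus, the hazard function $h_{NBurr}(x)$ attains its maximum at $x=\lambda\,[c-1]^{1/c}$. □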

The following example shows that the NBurr distribution does not preserve the likelihood ratio ordering. Recall that a random variable X is said to be larger than another random variable Y in the likelihood ratio ordering (written as $X\ge_{LR}Y$) if $f_X(x)/f_Y(x)$ is an increasing function of x for x > 0.

Example 3.

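Let X and Y be two random variables following NBurr distributions with parameters α1, β1, λ1, c1 and α2, β2, λ2, c2, respectively. Then, for all x > 0, the ratio of the corresponding density functions of X and Y is given by

$$\frac{f_{X}^{NBurr}(x)}{f_{Y}^{NBurr}(x)}=\frac{\left[\frac{\beta_1\lambda_1}{(x+\lambda_1)^{2}}\log\left(1+\left(\frac{x}{\lambda_1}\right)^{c_1}\right)+\left(\alpha_1-\frac{\beta_1\lambda_1}{x+\lambda_1}\right)\frac{c_1x^{c_1-1}}{x^{c_1}+\lambda_1^{c_1}}\right]\left[1+\left(\frac{x}{\lambda_1}\right)^{c_1}\right]^{-\left(\alpha_1-\frac{\beta_1}{1+x/\lambda_1}\right)}}{\left[\frac{\beta_2\lambda_2}{(x+\lambda_2)^{2}}\log\left(1+\left(\frac{x}{\lambda_2}\right)^{c_2}\right)+\left(\alpha_2-\frac{\beta_2\lambda_2}{x+\lambda_2}\right)\frac{c_2x^{c_2-1}}{x^{c_2}+\lambda_2^{c_2}}\right]\left[1+\left(\frac{x}{\lambda_2}\right)^{c_2}\right]^{-\left(\alpha_2-\frac{\beta_2}{1+x/\lambda_2}\right)}}. \tag{3.6}$$

Now, for λ1 = λ2 = 1,

Case I, when c1 = c2 = 1, (3.6) reduces to

$$\frac{f_{X}^{NBurr}(x)}{f_{Y}^{NBurr}(x)}=\frac{\left[\frac{\beta_1}{(1+x)^{2}}\log(1+x)+\left(\alpha_1-\frac{\beta_1}{1+x}\right)\frac{1}{1+x}\right](1+x)^{-\left(\alpha_1-\frac{\beta_1}{1+x}\right)}}{\left[\frac{\beta_2}{(1+x)^{2}}\log(1+x)+\left(\alpha_2-\frac{\beta_2}{1+x}\right)\frac{1}{1+x}\right](1+x)^{-\left(\alpha_2-\frac{\beta_2}{1+x}\right)}}=P_1(x),\text{ say.}$$

We can see that for α1 = 1.5, α2 = 2, β1 = 0.5 and β2 = 1.5, P1(0.1) = 1.366, P1(2) = 0.889 and P1(6) = 1.426, which implies that $f_{X}^{NBurr}(x)/f_{Y}^{NBurr}(x)$ is not monotone. Thus $X\ngeq_{LR}Y$.

Case II, when c1 ≠ c2, (3.6) reduces to

$$\frac{f_{X}^{NBurr}(x)}{f_{Y}^{NBurr}(x)}=\frac{\left[\frac{\beta_1}{(1+x)^{2}}\log\left(1+x^{c_1}\right)+\left(\alpha_1-\frac{\beta_1}{1+x}\right)\frac{c_1x^{c_1-1}}{1+x^{c_1}}\right]\left(1+x^{c_1}\right)^{-\left(\alpha_1-\frac{\beta_1}{1+x}\right)}}{\left[\frac{\beta_2}{(1+x)^{2}}\log\left(1+x^{c_2}\right)+\left(\alpha_2-\frac{\beta_2}{1+x}\right)\frac{c_2x^{c_2-1}}{1+x^{c_2}}\right]\left(1+x^{c_2}\right)^{-\left(\alpha_2-\frac{\beta_2}{1+x}\right)}}=P_2(x),\text{ say.}$$

Again, we can see that for c1 = 2, c2 = 4, α1 = 1.5, α2 = 2, β1 = 0.5 and β2 = 1.5, P2(0.5) = 1.5753, P2(1) = 0.4843 and P2(2) = 2.8757, which implies that $f_{X}^{NBurr}(x)/f_{Y}^{NBurr}(x)$ is not monotone. Thus $X\ngeq_{LR}Y$. Hence, it can be concluded that the proposed NBurr distribution does not preserve the likelihood ratio ordering. □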
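The numerical values quoted in Example 3 can be reproduced by evaluating the density ratio (3.6) directly. The following Python sketch does this for both cases; the function names and the tuple ordering (α, β, λ, c) are assumptions made for this illustration.

```python
import math


def nburr_pdf(x, alpha, beta, lam, c):
    """NBurr density (3.4) for x > 0."""
    log_term = math.log1p((x / lam) ** c)
    hazard = (beta * lam / (x + lam) ** 2 * log_term
              + (alpha - beta * lam / (x + lam)) * c * x ** (c - 1) / (x ** c + lam ** c))
    survival = math.exp(-(alpha - beta / (1.0 + x / lam)) * log_term)
    return hazard * survival


def density_ratio(x, params_x, params_y):
    """Ratio (3.6); each params tuple is (alpha, beta, lam, c)."""
    return nburr_pdf(x, *params_x) / nburr_pdf(x, *params_y)


def p1(x):
    """Case I: lam1 = lam2 = 1 and c1 = c2 = 1."""
    return density_ratio(x, (1.5, 0.5, 1.0, 1.0), (2.0, 1.5, 1.0, 1.0))


def p2(x):
    """Case II: lam1 = lam2 = 1, c1 = 2 and c2 = 4."""
    return density_ratio(x, (1.5, 0.5, 1.0, 2.0), (2.0, 1.5, 1.0, 4.0))


if __name__ == "__main__":
    print([round(p1(x), 3) for x in (0.1, 2.0, 6.0)])
    print([round(p2(x), 4) for x in (0.5, 1.0, 2.0)])
```

In both cases the ratio first decreases and then increases over the three evaluation points, which is the non-monotone behaviour the example relies on.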

In the next theorem, we show that the NBurr distribution preserves the usual stochastic ordering. Recall that a random variable X is said to be larger than another random variable Y in the usual stochastic ordering (written as $X\ge_{ST}Y$) if $S_X(x)\ge S_Y(x)$ for all x > 0.

Theorem 3.2.

Let X and Y be two random variables following NBurr distributions with parameters α1, β1, λ1, c1 and α2, β2, λ2, c2, respectively. Then, $X\ge_{ST}(\le_{ST})\,Y$ provided

  1. λ1 ≥ (≤) λ2,

  2. c1 ≤ (≥) c2,

  3. α1 ≤ (≥) α2, and

  4. β1 ≤ (≥) β2 for β > 0.

Proof.

$X\ge_{ST}(\le_{ST})\,Y$ if and only if for all x > 0, $S_{X}^{NBurr}(x)\ge(\le)\,S_{Y}^{NBurr}(x)$, which is equivalent to

$$\frac{\left[1+\left(\frac{x}{\lambda_2}\right)^{c_2}\right]^{\alpha_2-\frac{\beta_2\lambda_2}{x+\lambda_2}}}{\left[1+\left(\frac{x}{\lambda_1}\right)^{c_1}\right]^{\alpha_1-\frac{\beta_1\lambda_1}{x+\lambda_1}}}\ge(\le)\,1.$$

This holds if (i) λ1 ≥ (≤) λ2, (ii) c1 ≤ (≥) c2, (iii) α1 ≤ (≥) α2 and (iv) β1 ≤ (≥) β2, for β > 0. □

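The following theorem gives the condition under which the NBurr distribution preserves the hazard rate ordering. Recall that a random variable X is said to be larger than another random variable Y in the hazard rate ordering (written as $X\ge_{HR}Y$) if, for all x > 0, $h_X(x)\le h_Y(x)$.

Theorem 3.3.

Let X and Y be two random variables following NBurr distributions with parameters α1, β1, λ1, c1 and α2, β2, λ2, c2, respectively. Then, $X\ge_{HR}(\le_{HR})\,Y$ provided

  1. λ1 ≥ (≤) λ2,

  2. c1 ≤ (≥) c2,

  3. α1 ≤ (≥) α2, and

  4. β1 ≤ (≥) β2 for β > 0.

Proof.

$X\ge_{HR}(\le_{HR})\,Y$ if and only if for all x > 0, $h_{X}^{NBurr}(x)\le(\ge)\,h_{Y}^{NBurr}(x)$, which is equivalent to

$$\frac{\beta_1}{\lambda_1\left(1+\frac{x}{\lambda_1}\right)^{2}}\log\left(1+\left(\frac{x}{\lambda_1}\right)^{c_1}\right)+\left(\alpha_1-\frac{\beta_1}{1+\frac{x}{\lambda_1}}\right)\frac{c_1\left(\frac{x}{\lambda_1}\right)^{c_1-1}}{\lambda_1\left[1+\left(\frac{x}{\lambda_1}\right)^{c_1}\right]}\le(\ge)\,\frac{\beta_2}{\lambda_2\left(1+\frac{x}{\lambda_2}\right)^{2}}\log\left(1+\left(\frac{x}{\lambda_2}\right)^{c_2}\right)+\left(\alpha_2-\frac{\beta_2}{1+\frac{x}{\lambda_2}}\right)\frac{c_2\left(\frac{x}{\lambda_2}\right)^{c_2-1}}{\lambda_2\left[1+\left(\frac{x}{\lambda_2}\right)^{c_2}\right]}.$$

This holds if (i) λ1 ≥ (≤) λ2, (ii) c1 ≤ (≥) c2, (iii) α1 ≤ (≥) α2 and (iv) β1 ≤ (≥) β2, for β > 0. □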

Remark 1.

It is interesting to note that the hazard rate function of the NBurr distribution can be both increasing and decreasing under certain conditions as given in Theorem 3.1. Also, NBurr distribution preserves stochastic and hazard rate orderings as shown in Theorems 3.2 and 3.3, respectively.

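The Shannon entropy is an important and well-known concept in information theory as well as in the engineering sciences. Let X be a random variable that follows the NBurr distribution with parameters α, β, λ, c. Then Shannon’s entropy for the NBurr distribution is defined as

$$\begin{aligned}H(X)&=E\left[-\log f_{NBurr}(X)\right]=-\int_{0}^{\infty}f_{NBurr}(x)\log f_{NBurr}(x)\,dx\\&=-\int_{0}^{\infty}\left(\frac{\beta\lambda}{(x+\lambda)^{2}}\log\left(1+\frac{x^{c}}{\lambda^{c}}\right)+\left(\alpha-\frac{\beta\lambda}{x+\lambda}\right)\frac{c\,x^{c-1}}{x^{c}+\lambda^{c}}\right)\left[1+\left(\frac{x}{\lambda}\right)^{c}\right]^{-\left(\alpha-\frac{\beta}{1+x/\lambda}\right)}\\&\quad\times\left[\log\left(\frac{\beta\lambda}{(x+\lambda)^{2}}\log\left(1+\frac{x^{c}}{\lambda^{c}}\right)+\left(\alpha-\frac{\beta\lambda}{x+\lambda}\right)\frac{c\,x^{c-1}}{x^{c}+\lambda^{c}}\right)-\left(\alpha-\frac{\beta}{1+\frac{x}{\lambda}}\right)\log\left(1+\left(\frac{x}{\lambda}\right)^{c}\right)\right]dx.\end{aligned}$$

When a system is known to have survived up to age t, Shannon’s entropy is no longer useful for measuring the uncertainty about the residual lifetime of the system. Ebrahimi [3] introduced the residual entropy, defined as

$$H(X;t)=1-\int_{t}^{\infty}\frac{f_X(x)}{S_X(t)}\log\frac{f_X(x)}{S_X(x)}\,dx.$$

Then the residual entropy for the NBurr distribution with parameters α, β, λ, c is given by

$$\begin{aligned}H(X;t)&=1-\frac{1}{S_{NBurr}(t)}\int_{t}^{\infty}f_{NBurr}(x)\log h_{NBurr}(x)\,dx\\&=1-\left[1+\left(\frac{t}{\lambda}\right)^{c}\right]^{\alpha-\frac{\beta}{1+t/\lambda}}\\&\quad\times\int_{t}^{\infty}\left(\frac{\beta\lambda}{(x+\lambda)^{2}}\log\left(1+\left(\frac{x}{\lambda}\right)^{c}\right)+\left(\alpha-\frac{\beta\lambda}{x+\lambda}\right)\frac{c\,x^{c-1}}{x^{c}+\lambda^{c}}\right)\left[1+\left(\frac{x}{\lambda}\right)^{c}\right]^{-\left(\alpha-\frac{\beta}{1+x/\lambda}\right)}\\&\quad\times\log\left(\frac{\beta\lambda}{(x+\lambda)^{2}}\log\left(1+\left(\frac{x}{\lambda}\right)^{c}\right)+\left(\alpha-\frac{\beta\lambda}{x+\lambda}\right)\frac{c\,x^{c-1}}{x^{c}+\lambda^{c}}\right)dx.\end{aligned}$$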

3.2. Parameter estimation

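Let $x_1,x_2,\ldots,x_n$ be a sample of size n from the NBurr(α, β, λ, c) distribution. We give the procedure for parameter estimation, including the log-likelihood function and the corresponding normal equations. The log-likelihood function for the vector of parameters Θ = (α, β, λ, c)T corresponding to the NBurr distribution is given by

$$\begin{aligned}\ell(x;\alpha,\beta,\lambda,c)&=\sum_{i=1}^{n}\log\left[\frac{\beta\lambda}{(\lambda+x_i)^{2}}\log\left(1+\frac{x_i^{c}}{\lambda^{c}}\right)+\left(\alpha-\frac{\beta\lambda}{\lambda+x_i}\right)\frac{c\,x_i^{c-1}}{\lambda^{c}+x_i^{c}}\right]\\&\quad-\alpha\sum_{i=1}^{n}\log\left(1+\frac{x_i^{c}}{\lambda^{c}}\right)+\beta\lambda\sum_{i=1}^{n}\frac{1}{\lambda+x_i}\log\left(1+\frac{x_i^{c}}{\lambda^{c}}\right),\end{aligned} \tag{3.7}$$

where n is the sample size, and the maximum likelihood estimates of the unknown parameter vector (α, β, λ, c) are those that maximize the log-likelihood function ℓ in (3.7). The normal equations can be obtained by taking the partial derivatives of (3.7) w.r.t. α, β, λ, c and equating them to zero:

$$\frac{\partial\ell}{\partial\alpha}=\sum_{i=1}^{n}\frac{c\,\omega_i^{2}(\omega_i-1)^{c-1}}{\beta\left[1+(\omega_i-1)^{c}\right]\log\left[1+(\omega_i-1)^{c}\right]+c\,\omega_i(\alpha\omega_i-\beta)(\omega_i-1)^{c-1}}-\sum_{i=1}^{n}\log\left[1+(\omega_i-1)^{c}\right] \tag{3.8}$$

$$\frac{\partial\ell}{\partial\beta}=\sum_{i=1}^{n}\frac{\left[1+(\omega_i-1)^{c}\right]\log\left[1+(\omega_i-1)^{c}\right]-c\,\omega_i(\omega_i-1)^{c-1}}{\beta\left[1+(\omega_i-1)^{c}\right]\log\left[1+(\omega_i-1)^{c}\right]+c\,\omega_i(\alpha\omega_i-\beta)(\omega_i-1)^{c-1}}+\sum_{i=1}^{n}\frac{1}{\omega_i}\log\left[1+(\omega_i-1)^{c}\right] \tag{3.9}$$

$$\begin{aligned}\frac{\partial\ell}{\partial\lambda}&=\sum_{i=1}^{n}\frac{\beta(x_i-\lambda)\left(\lambda^{c}+x_i^{c}\right)^{2}\log\left(1+\frac{x_i^{c}}{\lambda^{c}}\right)-2c\beta\,x_i^{c}(\lambda+x_i)\left(\lambda^{c}+x_i^{c}\right)-c^{2}\lambda^{c-1}x_i^{c-1}(\lambda+x_i)^{2}\left[\alpha(\lambda+x_i)-\beta\lambda\right]}{\beta\lambda(\lambda+x_i)\left(\lambda^{c}+x_i^{c}\right)^{2}\log\left(1+\frac{x_i^{c}}{\lambda^{c}}\right)+c\,x_i^{c-1}\left[\alpha(\lambda+x_i)-\beta\lambda\right](\lambda+x_i)^{2}\left(\lambda^{c}+x_i^{c}\right)}\\&\quad+\frac{c\alpha}{\lambda}\sum_{i=1}^{n}\frac{x_i^{c}}{\lambda^{c}+x_i^{c}}+\beta\sum_{i=1}^{n}\frac{x_i\log\left(1+\frac{x_i^{c}}{\lambda^{c}}\right)}{(\lambda+x_i)^{2}}-c\beta\sum_{i=1}^{n}\frac{x_i^{c}}{(\lambda+x_i)\left(\lambda^{c}+x_i^{c}\right)}\end{aligned} \tag{3.10}$$

$$\begin{aligned}\frac{\partial\ell}{\partial c}&=\sum_{i=1}^{n}\frac{\beta(\omega_i-1)^{c}\left[1+(\omega_i-1)^{c}\right]\log(\omega_i-1)+\omega_i(\omega_i-1)^{c-1}(\alpha\omega_i-\beta)\left[1+c\log(\omega_i-1)+(\omega_i-1)^{c}\right]}{\beta\left[1+(\omega_i-1)^{c}\right]^{2}\log\left[1+(\omega_i-1)^{c}\right]+c\,\omega_i(\omega_i-1)^{c-1}(\alpha\omega_i-\beta)\left[1+(\omega_i-1)^{c}\right]}\\&\quad-\sum_{i=1}^{n}\frac{(\alpha\omega_i-\beta)(\omega_i-1)^{c}\log(\omega_i-1)}{\omega_i\left[1+(\omega_i-1)^{c}\right]},\end{aligned} \tag{3.11}$$

where $\omega_i=1+\frac{x_i}{\lambda}$.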

The MLEs of the four parameters α, β, λ and c of the NBurr distribution are obtained by setting the above partial derivatives to zero and solving the resulting equations simultaneously. Closed-form solutions are not available for equations (3.8), (3.9), (3.10) and (3.11), so an iterative algorithm must be applied to solve them numerically. For practical implementation of the model, we fit the NBurr models over the whole range of the data sets with the quasi-Newton BFGS numerical algorithm, with initial values chosen as $(\hat{\alpha}_0,\hat{\beta}_0,\hat{\lambda}_0,\hat{c}_0)=(1,1,1,1)$, to find the MLEs of the parameters.
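To make the estimation step concrete, the sketch below codes the log-likelihood (3.7) together with the score components for α and β, and compares the analytic scores with central finite differences. The function names, the toy data, and the parameter values are assumptions for illustration; the scores are written from differentiating (3.7) and are not taken from the authors' software.

```python
import math


def nburr_loglik(data, alpha, beta, lam, c):
    """Log-likelihood (3.7) for a sample of positive observations."""
    total = 0.0
    for x in data:
        log_term = math.log1p((x / lam) ** c)
        dens = (beta * lam / (lam + x) ** 2 * log_term
                + (alpha - beta * lam / (lam + x)) * c * x ** (c - 1) / (lam ** c + x ** c))
        total += math.log(dens) - alpha * log_term + beta * lam / (lam + x) * log_term
    return total


def _common(x, alpha, beta, lam, c):
    """Shared pieces written in terms of w = 1 + x/lam and u = w - 1."""
    w = 1.0 + x / lam
    u = w - 1.0
    log_term = math.log1p(u ** c)
    denom = beta * (1.0 + u ** c) * log_term + c * w * (alpha * w - beta) * u ** (c - 1)
    return w, u, log_term, denom


def score_alpha(data, alpha, beta, lam, c):
    """Partial derivative of (3.7) with respect to alpha."""
    s = 0.0
    for x in data:
        w, u, log_term, denom = _common(x, alpha, beta, lam, c)
        s += c * w * w * u ** (c - 1) / denom - log_term
    return s


def score_beta(data, alpha, beta, lam, c):
    """Partial derivative of (3.7) with respect to beta."""
    s = 0.0
    for x in data:
        w, u, log_term, denom = _common(x, alpha, beta, lam, c)
        s += ((1.0 + u ** c) * log_term - c * w * u ** (c - 1)) / denom + log_term / w
    return s


if __name__ == "__main__":
    data = [0.5, 1.2, 3.0, 7.5]          # toy data, not from the paper
    theta = (2.0, 0.5, 1.5, 1.2)         # (alpha, beta, lam, c), illustrative
    h = 1e-6
    fd = (nburr_loglik(data, theta[0] + h, *theta[1:])
          - nburr_loglik(data, theta[0] - h, *theta[1:])) / (2 * h)
    print("score_alpha:", score_alpha(data, *theta), "finite difference:", fd)
```

In practice the negative of `nburr_loglik` could be handed to a quasi-Newton routine such as `scipy.optimize.minimize` with `method="BFGS"`, started from (1, 1, 1, 1); that step is omitted here to keep the sketch dependency-free.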

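We also present the asymptotic distribution of the MLEs for the NBurr distribution. The Fisher information matrix (I) can be obtained by taking the expected values of the second-order and mixed partial derivatives of ℓ(x; α, β, λ, c) w.r.t. α, β, λ and c. Since the analytical expression is hard to compute, the matrix I = (Iij) can be approximated numerically. The asymptotic I matrix can be given as follows:

$$I=-\begin{pmatrix}\dfrac{\partial^{2}\ell}{\partial\alpha^{2}} & \dfrac{\partial^{2}\ell}{\partial\alpha\,\partial\beta} & \dfrac{\partial^{2}\ell}{\partial\alpha\,\partial\lambda} & \dfrac{\partial^{2}\ell}{\partial\alpha\,\partial c}\\ \dfrac{\partial^{2}\ell}{\partial\alpha\,\partial\beta} & \dfrac{\partial^{2}\ell}{\partial\beta^{2}} & \dfrac{\partial^{2}\ell}{\partial\beta\,\partial\lambda} & \dfrac{\partial^{2}\ell}{\partial\beta\,\partial c}\\ \dfrac{\partial^{2}\ell}{\partial\alpha\,\partial\lambda} & \dfrac{\partial^{2}\ell}{\partial\beta\,\partial\lambda} & \dfrac{\partial^{2}\ell}{\partial\lambda^{2}} & \dfrac{\partial^{2}\ell}{\partial\lambda\,\partial c}\\ \dfrac{\partial^{2}\ell}{\partial\alpha\,\partial c} & \dfrac{\partial^{2}\ell}{\partial\beta\,\partial c} & \dfrac{\partial^{2}\ell}{\partial\lambda\,\partial c} & \dfrac{\partial^{2}\ell}{\partial c^{2}}\end{pmatrix}$$

The second-order partial derivatives of ℓ(x; α, β, λ, c) w.r.t. α, β, λ and c can be calculated, but the calculations are very tedious and are therefore omitted. The variance-covariance matrix is approximated by M = (Mij), where $M_{ij}=\left(I^{-1}\right)_{ij}$. The asymptotic distribution of the MLEs of α, β, λ, and c can be written as

$$\left[(\hat{\alpha}-\alpha),(\hat{\beta}-\beta),(\hat{\lambda}-\lambda),(\hat{c}-c)\right]\sim N_4\left(0,I^{-1}(\hat{\theta})\right).$$

Then the approximate 100(1 − k)% confidence intervals for α, β, λ, and c are given by $\hat{\alpha}\pm Z_{k/2}\sqrt{Var(\hat{\alpha})}$, $\hat{\beta}\pm Z_{k/2}\sqrt{Var(\hat{\beta})}$, $\hat{\lambda}\pm Z_{k/2}\sqrt{Var(\hat{\lambda})}$, and $\hat{c}\pm Z_{k/2}\sqrt{Var(\hat{c})}$, where $\hat{\Theta}=(\hat{\alpha},\hat{\beta},\hat{\lambda},\hat{c})$ and $Z_k$ is the upper 100k-th percentile of the standard normal distribution.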
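The interval formula is a standard Wald construction and can be sketched in a few lines of Python. The function name, the estimate, and the variance used below are hypothetical placeholders; in an application the variance would come from the corresponding diagonal entry of the inverse information matrix.

```python
import math
from statistics import NormalDist


def wald_interval(estimate, variance, k=0.05):
    """Approximate 100(1 - k)% interval: estimate +/- Z_{k/2} * sqrt(variance)."""
    z = NormalDist().inv_cdf(1.0 - k / 2.0)  # upper 100(k/2)-th percentile
    half_width = z * math.sqrt(variance)
    return estimate - half_width, estimate + half_width


if __name__ == "__main__":
    # Hypothetical numbers for illustration only.
    lower, upper = wald_interval(estimate=3.0, variance=0.04, k=0.05)
    print(lower, upper)
```

Because the interval is symmetric, it can extend outside the admissible parameter range (for example below zero for λ or c) when the variance is large, which is worth checking in applications.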

3.3. Goodness of fit

The closeness between the hypothesized NBurr distribution and an observed real-world network can be assessed by a goodness-of-fit test. We use the Chi-square test statistic and its corresponding p value to determine the goodness of fit for the NBurr distribution. We calculate the respective p values using the bootstrap resampling technique, as given below:

  • Initially, the best-fit NBurr distribution is determined by estimating the parameters through the MLE method for the given network data. Then we calculate the Chi-square statistic as a measure of goodness of fit for the best-fitted NBurr model.

  • Next, we generate 50000 synthetic data sets from the fitted NBurr distribution and calculate the Chi-square statistic for each of the synthetic data sets.

  • Finally, we obtain the p value as the fraction of NBurr synthetic data sets with a Chi-square value greater than the empirical one. Higher p values indicate that the proposed model is more suitable for the data set.
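The three steps above can be sketched as a small parametric bootstrap in Python. This is an illustration under explicit assumptions, not the authors' implementation: the bin edges are user-supplied, parameters are held fixed at the fitted values rather than re-estimated for each synthetic data set, sampling uses bisection on the CDF (3.1), and the default number of replicates is far smaller than the 50000 used in the paper. All function names are hypothetical.

```python
import math
import random


def nburr_cdf(x, alpha, beta, lam, c):
    """NBurr CDF (3.1); returns 1 at +infinity so open-ended bins work."""
    if x <= 0:
        return 0.0
    if math.isinf(x):
        return 1.0
    z = x / lam
    return 1.0 - math.exp(-(alpha - beta / (1.0 + z)) * math.log1p(z ** c))


def nburr_sample(n, params, rng):
    """Inverse-transform sampling by bisection (assumes a non-decreasing CDF)."""
    draws = []
    for _ in range(n):
        p = rng.random()
        lo, hi = 0.0, 1.0
        while nburr_cdf(hi, *params) < p:
            hi *= 2.0
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            if nburr_cdf(mid, *params) < p:
                lo = mid
            else:
                hi = mid
        draws.append(0.5 * (lo + hi))
    return draws


def chi_square_stat(sample, edges, params):
    """Pearson Chi-square statistic over the bins [edges[j], edges[j+1])."""
    n = len(sample)
    stat = 0.0
    for a, b in zip(edges[:-1], edges[1:]):
        observed = sum(1 for x in sample if a <= x < b)
        expected = n * (nburr_cdf(b, *params) - nburr_cdf(a, *params))
        if expected > 0:
            stat += (observed - expected) ** 2 / expected
    return stat


def bootstrap_p_value(data, edges, params, n_boot=200, seed=0):
    """Fraction of synthetic data sets whose statistic exceeds the empirical one."""
    rng = random.Random(seed)
    empirical = chi_square_stat(data, edges, params)
    exceed = 0
    for _ in range(n_boot):
        synthetic = nburr_sample(len(data), params, rng)
        if chi_square_stat(synthetic, edges, params) > empirical:
            exceed += 1
    return exceed / n_boot


if __name__ == "__main__":
    params = (2.0, 0.5, 1.0, 1.5)  # (alpha, beta, lam, c), illustrative only
    edges = [0.0, 0.5, 1.0, 2.0, 4.0, math.inf]
    data = nburr_sample(100, params, random.Random(1))  # stand-in for real data
    print(bootstrap_p_value(data, edges, params, n_boot=50, seed=2))
```

Re-estimating the parameters on every synthetic data set is a common refinement of this scheme; it is left out here because the steps above do not specify it.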

In addition, the effectiveness of the proposed NBurr distribution compared with other heavy-tailed distributions is also verified by computing other well-known statistical measures such as the Kullback-Leibler divergence (KLDiv), root mean squared error (RMSE), and mean absolute error (MAE).
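These three measures can be computed from an empirical probability vector and the corresponding fitted probabilities. The sketch below assumes both are given as equal-length lists over the same support (for example, degree values), with the fitted probabilities strictly positive wherever the empirical ones are; how the article bins or normalizes the data is not specified here, so this is only one plausible convention.

```python
import math


def kl_divergence(empirical, fitted):
    """Kullback-Leibler divergence sum p * log(p / q); terms with p = 0 contribute 0."""
    return sum(p * math.log(p / q) for p, q in zip(empirical, fitted) if p > 0)


def rmse(empirical, fitted):
    """Root mean squared error between the two probability vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(empirical, fitted)) / len(empirical))


def mae(empirical, fitted):
    """Mean absolute error between the two probability vectors."""
    return sum(abs(p - q) for p, q in zip(empirical, fitted)) / len(empirical)


if __name__ == "__main__":
    empirical = [0.5, 0.5]    # toy values for illustration
    fitted = [0.25, 0.75]
    print(kl_divergence(empirical, fitted), rmse(empirical, fitted), mae(empirical, fitted))
```

Smaller values of all three measures indicate a closer fit, and each equals zero when the fitted and empirical probabilities coincide.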

4. Real-world applications

We show the application of the NBurr distribution in the analysis of large-scale complex network data sets from various disciplines. Examples of such large-scale real-world complex networks include Twitter, Facebook, Orkut, Youtube, Amazon, LinkedIn, and Wiki networks, where the number of nodes is of the order of thousands or millions. There has been significant interest in modeling such large-scale complex networks. Recent research [23, 28, 44] has analyzed various important structural characteristics of networks, such as degree distribution, average nearest neighbor, clustering coefficient, community discovery, and motif distribution. Most of the interest has focused on the analysis of the node degree distributions of these real-world networks [2, 6, 7]. Empirical observations suggest that the node degree distributions of such real-world networks, for example collaboration networks, communication networks, social networks, and biological networks, follow a heavy-tailed power-law distribution [6, 33]. Previous researchers reported that baseline power-law, exponential, Pareto, log-normal, and Burr models are insufficient to fit the empirical data properly over its whole range unless some of the lower-degree nodes are left out while fitting the model [10–12, 40, 43]. In recent work, Broido et al. [8] pointed out that recent data on these networks show that they no longer follow a power-law distribution.

4.1. Data

We consider large-scale real-world network data sets from different disciplines, namely social networks, collaboration networks, citation networks, web graphs, product co-purchasing networks, temporal networks, communication networks, and ground-truth networks. We study several individual data sets from each discipline to showcase the general applicability of the proposed NBurr distribution. These data sets are publicly available at http://snap.stanford.edu/data/index.html. They are standard network data sets with heavy-tailed behavior and are used for modeling in the statistical analysis of networks [33–35]. An overview of these publicly available network data sets, along with statistical measures (mean (μ), standard deviation (s), etc.), is presented in Table 3. Another interesting property of these network data sets is that their coefficient of variation (s/μ) exceeds unity.

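Table 3

Network data sets and estimated parameters of the proposed NBurr model

| Category | Network | # Nodes | # Edges | s | μ | s/μ | α̂ | β̂ | λ̂ | ĉ | Bootstrap chi-square value (p) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Social | TwitterNet | 81,306 | 1,768,149 | 57.965 | 21.747 | 2.6654 | 3.3395 | 0.0918 | 43.693 | 0.7324 | 0.9740 |
| Social | GplusNet | 107,614 | 13,673,453 | 1404.8 | 283.42 | 4.9568 | 1.3737 | 0.4503 | 24.924 | 0.5837 | 0.9890 |
| Social | DeliciousNet | 536,108 | 1,365,961 | 39.826 | 10.673 | 3.7312 | 7.6500 | 3.1845 | 26.999 | 0.2197 | 0.9651 |
| Social | LiveJournalNet | 4,847,571 | 68,993,773 | 44.969 | 15.368 | 2.926 | 7.7807 | 3.7317 | 79.102 | 0.3536 | 0.9720 |
| Social | AthletesFacebookNet | 13,866 | 86,859 | 17.978 | 12.438 | 1.4453 | 2.8158 | -0.4558 | 24.053 | 1.1022 | 0.9400 |
| Citation | HepThNet | 27,770 | 352,807 | 43.139 | 15.220 | 2.8342 | 2.9198 | -0.4475 | 26.793 | 0.7954 | 0.8300 |
| Citation | PatentsNet | 3,774,768 | 16,518,948 | 6.9125 | 5.0687 | 1.3637 | 6.7724 | 3.9977 | 15.758 | 0.5793 | 0.8130 |
| Citation | CiteseerNet | 227,320 | 814,134 | 9.8260 | 5.4322 | 1.8088 | 3.4008 | -0.8157 | 12.286 | 0.8391 | 0.6160 |
| Web | GoogleNet | 875,713 | 5,105,039 | 43.320 | 7.1444 | 6.0634 | 16.595 | 5.6714 | 61.718 | 0.1137 | 0.9865 |
| Web | BerkStanNet | 685,230 | 7,600,595 | 300.08 | 12.316 | 24.364 | 0.4522 | 0.1833 | 1.1626 | 2.2058 | 0.6550 |
| Web | Wikipedia2009Net | 1,864,433 | 4,507,315 | 12.846 | 4.8903 | 2.6268 | 1.8443 | -0.9706 | 3.0329 | 0.9147 | 0.9750 |
| Product CoPurchasing | Amazon0601Net | 403,394 | 3,387,388 | 15.279 | 8.3989 | 1.8191 | 2.9534 | 1.8907 | 9.5274 | 0.9508 | 0.6920 |
| Product CoPurchasing | Amazon0505Net | 410,236 | 3,356,828 | 15.313 | 8.1826 | 1.8714 | 4.3982 | 3.3724 | 12.739 | 0.6965 | 0.6670 |
| Product CoPurchasing | Amazon0312Net | 400,727 | 3,200,444 | 15.073 | 7.9865 | 1.8873 | 6.2523 | 5.5583 | 16.563 | 0.4971 | 0.6517 |
| Temporal | MathoverflowNet | 24,818 | 506,550 | 31.476 | 10.424 | 3.0195 | 0.5426 | 0.7039 | 1.1057 | 1.6187 | 0.9890 |
| Temporal | SuperuserNet | 194,085 | 1,443,339 | 23.782 | 5.8239 | 4.0836 | 0.5987 | 0.5741 | 0.9920 | 1.8685 | 0.9800 |
| Temporal | AskubuntuNet | 159,316 | 964,437 | 18.404 | 4.3856 | 4.1966 | 0.5997 | 1.1289 | 0.7386 | 2.0061 | 0.9760 |
| Communication | EmailEnronNet | 36,692 | 183,831 | 36.100 | 10.021 | 3.6027 | 4.1159 | 2.6481 | 6.3332 | 0.3129 | 0.9900 |
| Communication | WikiTalkNet | 2,394,385 | 5,021,410 | 12.259 | 2.1195 | 5.7844 | 0.2091 | 2.7853 | 0.1733 | 3.4821 | 0.9760 |
| Communication | RecLibimsetiNet | 220,970 | 17,359,346 | 413.71 | 102.85 | 4.0227 | 4.6667 | 1.8315 | 123.52 | 0.2107 | 0.9843 |
| GroundTruth | WikiTopcatsNet | 1,791,489 | 28,511,807 | 283.78 | 15.915 | 17.831 | 1.5987 | 0.8982 | 2.8452 | 0.7260 | 0.8400 |
| GroundTruth | OrkutNet | 3,072,441 | 117,185,083 | 154.78 | 76.281 | 2.0291 | 5.4171 | 5.0492 | 137.14 | 0.5217 | 0.9891 |
| GroundTruth | YoutubeNet | 1,134,890 | 2,987,624 | 50.754 | 5.2650 | 9.6398 | 2.5928 | -0.4403 | 0.3159 | 0.5004 | 0.8520 |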

4.2. Experimental results

In this section, we compare the NBurr distribution with seven established models, namely the Power-law, Pareto, Log-normal, Power-law with cutoff, Burr, exponentiated Burr [26], and MO Burr [22] distributions. To estimate the parameters (α, β, λ, c) of the NBurr distribution numerically, we used the ‘optim’ function with the quasi-Newton L-BFGS-B algorithm in the R statistical software, taking the initial parameter values (α, β, λ, c) = (1, 1, 1, 1). The estimated parameter values for all the network data sets satisfy the conditions α > 0, β > −1, λ > 0, and c > 0, as shown in Table 3, which characterize the proposed NBurr distribution. Empirically, in the case of social networks, the estimated value of λ is higher than the estimated value of α. Table 3 also shows that the proposed NBurr distribution yields high p-values in the bootstrap chi-square test (at least 0.6160 for every network), so the null hypothesis, “the data are drawn from the NBurr distribution”, cannot be rejected at the 0.05 level of significance. This supports the use of the NBurr distribution for fitting the node degree distribution of a network. The experimental results in Table 3 suggest that the proposed NBurr distribution is effective in modeling the entire degree distribution of real-world complex networks. We also used other statistical measures, viz. KLDiv, RMSE, and MAE, to compare the performance of the proposed NBurr distribution with the competing heavy-tailed distributions, as shown in Tables 4 and 5.
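The bootstrap chi-square test mentioned above can be sketched in Python as follows. The exact procedure used for Table 3 is not spelled out in this section, so this is only one plausible simplified version: the fitted model probabilities are held fixed, synthetic samples are drawn from them, and the p-value is the share of synthetic chi-square statistics at least as large as the observed one. The function names, the bin probabilities, and the counts are all hypothetical. A fuller parametric bootstrap would also re-estimate the parameters on every synthetic sample.

```python
import random

def chi_square_stat(observed, expected):
    """Pearson chi-square statistic over bins with positive expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

def bootstrap_chi_square_pvalue(observed, model_probs, n_boot=2000, seed=0):
    """Estimate a p-value by simulating samples from the fitted model.

    observed    : observed counts per bin (e.g. per degree value)
    model_probs : fitted model probabilities per bin (should sum to 1)
    """
    rng = random.Random(seed)
    n = sum(observed)
    expected = [n * p for p in model_probs]
    stat_obs = chi_square_stat(observed, expected)
    bins = list(range(len(model_probs)))
    exceed = 0
    for _ in range(n_boot):
        # Draw a synthetic sample of size n from the fitted model
        draws = rng.choices(bins, weights=model_probs, k=n)
        counts = [0] * len(model_probs)
        for b in draws:
            counts[b] += 1
        if chi_square_stat(counts, expected) >= stat_obs:
            exceed += 1
    return exceed / n_boot

# Hypothetical fitted probabilities and two sets of observed counts
probs = [0.5, 0.3, 0.15, 0.05]
p_good = bootstrap_chi_square_pvalue([50, 30, 15, 5], probs)
p_bad = bootstrap_chi_square_pvalue([5, 15, 30, 50], probs)
print(p_good, p_bad)
```

Counts that match the model give a p-value near one, while counts that contradict it give a p-value near zero. A large p-value, as reported in Table 3, therefore means the test finds no evidence against the fitted model.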

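Table 4

Performances of the proposed NBurr model in terms of RMSE, KLDiv, and MAE compared to the competitive heavy-tailed models over real-world networks

| Category | Network | Burr RMSE | Burr KLDiv | Burr MAE | NBurr RMSE | NBurr KLDiv | NBurr MAE | Exponentiated Burr RMSE | Exponentiated Burr KLDiv | Exponentiated Burr MAE | MO Burr RMSE | MO Burr KLDiv | MO Burr MAE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Social | TwitterNet | 13.179 | 0.00788 | 1.1663 | 13.391 | 0.00787 | 1.1725 | 34.754 | 0.02012 | 3.1259 | 28.727 | 0.01911 | 2.7780 |
| Social | GplusNet | 2.7411 | 0.05621 | 0.1994 | 1.8217 | 0.05572 | 0.1849 | 9.0526 | 0.06382 | 0.3193 | 4.2128 | 0.05679 | 0.2171 |
| Social | DeliciousNet | 29.147 | 0.00499 | 1.7952 | 13.810 | 0.00453 | 1.1364 | 20.291 | 0.00464 | 1.4183 | 44.550 | 0.00627 | 2.4668 |
| Social | LiveJournalNet | 384.81 | 0.00168 | 13.924 | 219.73 | 0.00050 | 5.0602 | 464.51 | 0.00337 | 19.028 | 916.55 | 0.00951 | 32.342 |
| Social | AthletesFacebookNet | 4.9857 | 0.00886 | 1.3876 | 4.8669 | 0.00885 | 1.3721 | 10.288 | 0.01361 | 2.6042 | 8.0527 | 0.00965 | 1.8504 |
| Citation | HepThNet | 2.6780 | 0.01331 | 0.4718 | 2.4786 | 0.01327 | 0.4561 | 11.985 | 0.02091 | 1.0592 | 2.8134 | 0.01346 | 0.4756 |
| Citation | PatentsNet | 626.74 | 0.00024 | 64.674 | 68.176 | 6.5E-05 | 10.397 | 173.55 | 9.4E-05 | 22.971 | 898.55 | 0.00030 | 84.917 |
| Citation | CiteseerNet | 43.719 | 0.00200 | 3.6469 | 33.484 | 0.00186 | 3.1622 | 13.165 | 0.00216 | 1.8519 | 39.521 | 0.00192 | 3.4805 |
| Web | GoogleNet | 187.96 | 0.00699 | 8.3224 | 117.15 | 0.00428 | 5.8484 | 197.65 | 0.00640 | 8.1009 | 177.87 | 0.00708 | 8.2026 |
| Web | BerkStanNet | 32.321 | 0.02671 | 0.5353 | 32.388 | 0.02679 | 0.5355 | 31.945 | 0.02670 | 0.5260 | 32.198 | 0.02671 | 0.5301 |
| Web | Wikipedia2009Net | 176.29 | 0.00115 | 9.4514 | 52.148 | 0.00104 | 4.7242 | 91.517 | 0.00097 | 6.0324 | 168.35 | 0.00115 | 9.1700 |
| Product CoPurchasing | Amazon0601Net | 92.912 | 0.00315 | 7.6485 | 77.387 | 0.00283 | 6.4876 | 188.63 | 0.00719 | 14.032 | 87.137 | 0.00265 | 6.9522 |
| Product CoPurchasing | Amazon0505Net | 118.17 | 0.00375 | 9.1393 | 78.727 | 0.00297 | 6.9574 | 245.89 | 0.01183 | 18.526 | 112.00 | 0.00335 | 8.5741 |
| Product CoPurchasing | Amazon0312Net | 105.14 | 0.00337 | 8.1901 | 71.210 | 0.00260 | 6.0427 | 106.63 | 0.00342 | 8.2086 | 80.538 | 0.00263 | 6.4877 |
| Temporal | MathoverflowNet | 8.8478 | 0.01432 | 1.3720 | 6.1401 | 0.01381 | 1.1713 | 6.5265 | 0.01421 | 1.2302 | 9.2084 | 0.01432 | 1.3693 |
| Temporal | SuperuserNet | 17.943 | 0.00319 | 1.3692 | 11.709 | 0.00303 | 1.0642 | 18.326 | 0.00319 | 1.3717 | 18.797 | 0.00319 | 1.3977 |
| Temporal | AskubuntuNet | 33.041 | 0.00352 | 2.1017 | 27.986 | 0.00334 | 1.8650 | 34.267 | 0.00375 | 2.1469 | 31.861 | 0.00326 | 1.9377 |
| Communication | EmailEnronNet | 76.787 | 0.03525 | 5.2356 | 73.139 | 0.03381 | 5.0662 | 74.244 | 0.03497 | 5.1816 | 76.118 | 0.03516 | 5.2236 |
| Communication | WikiTalkNet | 517.42 | 0.00343 | 21.725 | 146.89 | 0.00109 | 8.2835 | 665.73 | 0.00356 | 25.758 | 668.30 | 0.00356 | 25.823 |
| Communication | RecLibimsetiNet | 41.348 | 0.03263 | 0.9010 | 22.851 | 0.03400 | 0.7659 | 48.044 | 0.06648 | 1.4156 | 44.413 | 0.06273 | 1.3570 |
| GroundTruth | WikiTopcatsNet | 9.6105 | 0.00188 | 0.0913 | 6.1751 | 0.00187 | 0.0767 | 10.724 | 0.00189 | 0.0961 | 9.2087 | 0.00188 | 0.0898 |
| GroundTruth | OrkutNet | 194.24 | 0.00721 | 6.2222 | 95.580 | 0.00298 | 3.3811 | 654.44 | 0.12206 | 35.536 | 116.81 | 0.00329 | 3.1295 |
| GroundTruth | YoutubeNet | 37.006 | 0.00112 | 0.5858 | 33.861 | 0.00112 | 0.5684 | 46.425 | 0.00116 | 0.6403 | 31.298 | 0.00111 | 0.5312 |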
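
Table 5

Performances of the proposed NBurr model in terms of RMSE, KLDiv, and MAE compared to the competitive heavy-tailed models over real-world networks

| Category | Network | Burr RMSE | Burr KLDiv | Burr MAE | NBurr RMSE | NBurr KLDiv | NBurr MAE | Exponentiated Burr RMSE | Exponentiated Burr KLDiv | Exponentiated Burr MAE | MO Burr RMSE | MO Burr KLDiv | MO Burr MAE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Social | TwitterNet | 204.35 | 0.1831 | 10.847 | 354.25 | 0.2857 | 15.603 | 53.863 | 0.0169 | 2.9494 | 68.004 | 0.0397 | 4.1974 |
| Social | GplusNet | 53.064 | 0.2299 | 0.9221 | 86.955 | 0.3113 | 1.1847 | 10.155 | 0.0678 | 0.2523 | 30.925 | 0.1475 | 0.6821 |
| Social | DeliciousNet | 349.66 | 0.2021 | 14.867 | 471.02 | 0.1349 | 17.874 | 281.34 | 0.0579 | 10.781 | 66.896 | 0.0185 | 4.2366 |
| Social | LiveJournalNet | 5025.2 | 0.1614 | 127.98 | 8100.9 | 0.1785 | 164.18 | 3473.6 | 0.0355 | 70.64 | 808.79 | 0.0101 | 24.501 |
| Social | AthletesFacebookNet | 100.16 | 0.2049 | 13.387 | 204.91 | 0.4164 | 23.839 | 15.461 | 0.0127 | 2.5674 | 25.099 | 0.0324 | 4.3180 |
| Citation | HepThNet | 73.531 | 0.1741 | 4.0821 | 122.79 | 0.2566 | 5.997 | 22.59 | 0.0255 | 2.331 | 25.42 | 0.0464 | 2.816 |
| Citation | PatentsNet | 27.5K | 0.2266 | 2049.5 | 34.8K | 0.2366 | 2533.2 | 9612.7 | 0.0192 | 725.71 | 2424.5 | 0.0061 | 271.89 |
| Citation | CiteseerNet | 889.88 | 0.3308 | 49.467 | 1156.2 | 0.2916 | 62.026 | 353.26 | 0.0299 | 21.921 | 195.67 | 0.0131 | 13.877 |
| Web | GoogleNet | 1809.1 | 0.124 | 45.023 | 1809.2 | 0.124 | 45.023 | 1514.5 | 0.0878 | 40.067 | 188.01 | 0.0157 | 9.6549 |
| Web | BerkStanNet | 615.03 | 0.1863 | 4.0722 | 615.01 | 0.1863 | 4.0721 | 185.04 | 0.1002 | 2.0198 | 322.63 | 0.1037 | 2.8203 |
| Web | Wikipedia2009Net | 4371.9 | 0.1352 | 164.58 | 4371.9 | 0.1352 | 164.58 | 2720.9 | 0.0798 | 116.43 | 781.87 | 0.0082 | 35.531 |
| Product CoPurchasing | Amazon0601Net | 1495.4 | 0.2708 | 70.281 | 2539.8 | 0.4022 | 114.59 | 286.61 | 0.0102 | 16.881 | 297.39 | 0.0382 | 22.199 |
| Product CoPurchasing | Amazon0505Net | 1572.9 | 0.2463 | 73.003 | 2494.5 | 0.3711 | 111.56 | 358.59 | 0.0125 | 19.123 | 260.85 | 0.0342 | 20.136 |
| Product CoPurchasing | Amazon0312Net | 1564.4 | 0.2425 | 71.875 | 2462.9 | 0.3686 | 109.21 | 338.03 | 0.0116 | 17.742 | 273.39 | 0.0352 | 20.381 |
| Temporal | MathoverflowNet | 213.91 | 0.2131 | 13.600 | 213.82 | 0.2132 | 13.612 | 41.934 | 0.0634 | 5.3161 | 92.603 | 0.0861 | 7.9912 |
| Temporal | SuperuserNet | 900.04 | 0.1808 | 33.837 | 900.33 | 0.1808 | 33.839 | 243.42 | 0.0616 | 13.199 | 354.72 | 0.0570 | 16.613 |
| Temporal | AskubuntuNet | 949.66 | 0.2091 | 39.419 | 949.73 | 0.2091 | 39.420 | 212.91 | 0.0649 | 12.451 | 389.14 | 0.0719 | 20.113 |
| Communication | EmailEnronNet | 246.51 | 0.1779 | 14.886 | 245.25 | 0.1778 | 14.859 | 121.47 | 0.0873 | 8.445 | 95.468 | 0.0689 | 7.664 |
| Communication | WikiTalkNet | 9669.4 | 0.3376 | 293.63 | 9669.4 | 0.3376 | 293.63 | 7978.6 | 0.1902 | 246.26 | 672.32 | 0.0036 | 25.905 |
| Communication | RecLibimsetiNet | 77.081 | 0.2198 | 2.1486 | 133.91 | 0.2096 | 2.7441 | 87.472 | 0.0755 | 1.4021 | 28.059 | 0.0359 | 0.6971 |
| GroundTruth | WikiTopcatsNet | 565.21 | 0.1377 | 2.6145 | 930.44 | 0.1612 | 3.8073 | 272.99 | 0.0464 | 1.5159 | 389.86 | 0.0629 | 2.2289 |
| GroundTruth | OrkutNet | 2443.6 | 0.5498 | 80.712 | 4299.3 | 0.8033 | 101.64 | 452.92 | 0.0459 | 19.624 | 261.75 | 0.0479 | 16.197 |
| GroundTruth | YoutubeNet | 1380.5 | 0.1342 | 15.690 | 1380.5 | 0.1342 | 15.691 | 1422.2 | 0.1416 | 17.219 | 143.79 | 0.0045 | 2.2564 |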

Given a network, the differences between actual and predicted degree frequencies can be quantified by the root mean squared error (RMSE) and the mean absolute error (MAE). Smaller values of RMSE and MAE indicate higher similarity between the actual and predicted distributions. Tables 4 and 5 show that the proposed NBurr distribution produces smaller RMSE and MAE values than the competing distributions for most networks. The exceptions are a few networks where the Burr, MO Burr, or power-law cutoff distribution performs better than the proposed one. Power-law and Pareto distributions perform worse than the other competing distributions in terms of RMSE and MAE over all the real-world networks, as seen from Table 5. Table 4 shows that the Burr and MO Burr distributions perform similarly, as they produce similar RMSE and MAE values. The dissimilarity between two probability distributions can also be measured by the Kullback-Leibler divergence (KLDiv). Smaller KLDiv values indicate higher similarity between the actual and the predicted distribution. The proposed NBurr distribution produces smaller KLDiv values than the others in almost all the networks, as seen from Tables 4 and 5. This in turn suggests that the proposed NBurr distribution is much closer to the observed degree distribution and superior to the state-of-the-art models in almost all the networks.
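The three measures discussed above can be sketched in Python as follows. The function names and the toy frequency vectors are hypothetical. The section does not give the exact formulas used for Tables 4 and 5, so the standard definitions are assumed, with both frequency vectors normalized to probability distributions before the KL divergence is computed.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length frequency vectors."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error between two equal-length frequency vectors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def kl_divergence(actual, predicted):
    """KL(P || Q) after normalizing both frequency vectors to sum to 1.

    Assumes predicted > 0 wherever actual > 0.
    """
    total_a, total_p = sum(actual), sum(predicted)
    total = 0.0
    for a, p in zip(actual, predicted):
        if a > 0:  # bins with zero observed frequency contribute nothing
            total += (a / total_a) * math.log((a / total_a) / (p / total_p))
    return total

# Toy observed and model-predicted degree frequencies (hypothetical values)
actual = [100, 50, 25, 10]
predicted = [90, 55, 30, 10]
print(rmse(actual, predicted), mae(actual, predicted), kl_divergence(actual, predicted))
```

All three measures are zero when the predicted frequencies equal the observed ones and grow as the fit worsens, so smaller values indicate a better fit.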

In conclusion, taking all the measures (RMSE, MAE, and KLDiv) together, the proposed NBurr distribution performs better than the competing distributions. This suggests that the observed degree distribution of a real-world network plausibly comes from the proposed NBurr distribution.

The scatter plot of the fitted results can be used to verify the effectiveness of the proposed NBurr distribution. To do this, for a given real-world network we draw log-log plots of the original frequency distribution, the frequency estimated by the NBurr distribution, and the frequencies estimated by the Burr, Exponentiated Burr, MO Burr, Power-law, Pareto Type-I, Log-normal, and Power-law Cutoff distributions. Eight such plots are given in Figures 2–5, for TwitterNet, LiveJournalNet, CiteseerNet, BerkStanNet, Amazon0601Net, SuperuserNet, EmailEnronNet, and WikiTopcatsNet. In Figures 2–5, the proposed NBurr curve passes through the middle of the scatter plot of each observed distribution. This indicates that the proposed NBurr distribution provides a better fit than the other competing distributions in almost all of the networks. The plotted results thus support the conclusion that the entire node degree distribution is better represented by the NBurr distribution than by other heavy-tailed distributions. Hence the proposed NBurr distribution, a modification of the Burr distribution with a nonlinear exponent in the shape parameter, can be used for effective and efficient modeling of the entire degree distribution of real-world networks without ignoring the lower-degree nodes. Finally, these heavy-tailed network data sets, when modeled over the whole range using the proposed NBurr distribution, show clear improvements in comparison to state-of-the-art models.

FIGURE 2. Degree distribution of TwitterNet and LiveJournalNet in log-log scale

FIGURE 3. Degree distribution of CiteseerNet and BerkStanNet in log-log scale

FIGURE 4. Degree distribution of Amazon0601Net and SuperuserNet in log-log scale

FIGURE 5. Degree distribution of EmailEnronNet and WikiTopcatsNet in log-log scale

5. Conclusion

In this paper, we present a new method called shape parameter transformation (SPT) to extend the Burr and related families. The idea of the SPT method is to use a nonlinear exponent (which depends on the data and an additional parameter) instead of a constant shape parameter in the Burr, Pareto, and exponentiated families of distributions. The method was first applied to the Burr distribution, and a generalized Burr (GBurr) model was introduced. These newly introduced GBurr models belong to the maximum domain of attraction of the Frechet distribution and are right-tail equivalent to the Pareto distribution. Further, the SPT method was applied to the classical Pareto and exponentiated distributions. We also studied a relevant model from the generalized Burr family, namely the NBurr distribution, and derived various statistical properties. The practical usefulness of the proposed NBurr distribution was shown using multiple heavy-tailed network data sets from different disciplines. The NBurr distribution is thus a new competitor for the popularly used MO Burr and exponentiated Burr distributions. An immediate extension of this work is to apply these newly introduced probability models to survival and lifetime data sets. Another possible extension would be to apply the proposed SPT method to other size distributions available in the statistics literature.


(Communicated by Gejza Wimmer)


Acknowledgement

The authors are thankful to Professor Gopal K. Basak of the Indian Statistical Institute, Kolkata, for constructive comments and insightful suggestions on this paper.

REFERENCES

[1] AFIFY, A. Z.—CORDEIRO, G. M.—ORTEGA, E. M.—YOUSOF, H. M.—BUTT, N. S.: The four-parameter Burr XII distribution: Properties, regression model, and applications, Comm. Statist. Theory Methods 47 (2018), 2605–2624. doi:10.1080/03610926.2016.1231821

[2] AMARAL, L. A. N.—SCALA, A.—BARTHELEMY, M.—STANLEY, H. E.: Classes of small-world networks, Proc. Natl. Acad. Sci. USA 97 (2000), 11149–11152. doi:10.1073/pnas.200327197

[3] ASADI, M.—EBRAHIMI, N.: Residual entropy and its characterizations in terms of hazard function and mean residual life function, Statist. Probab. Lett. 49 (2000), 263–269. doi:10.1016/S0167-7152(00)00056-0

[4] ASADI, M.: Characterization of the Pearson system of distributions based on reliability measures, Statist. Papers 39 (1998), 347–360. doi:10.1007/BF02927098

[5] AUSTIN, J. A.: Control chart constants for largest and smallest in sampling from a normal distribution using the generalized Burr distribution, Technometrics 15 (1973), 931–933. doi:10.1080/00401706.1973.10489126

[6] BARABÁSI, A. L.: The origin of bursts and heavy tails in human dynamics, Nature 435 (2005), 207–211. doi:10.1038/nature03459

[7] BARABÁSI, A. L.—ALBERT, R.: Emergence of scaling in random networks, Science 286 (1999), 509–512. doi:10.1515/9781400841356.349

[8] BROIDO, A. D.—CLAUSET, A.: Scale-free networks are rare, Nature Communications 10 (2019), 1–10. doi:10.1038/s41467-019-08746-5

[9] BURR, I. W.: Cumulative frequency functions, Ann. Math. Statist. 13 (1942), 215–232. doi:10.1214/aoms/1177731607

[10] CHATTOPADHYAY, S.—CHAKRABORTY, T.—GHOSH, K.—DAS, A. K.: Uncovering patterns in heavy-tailed networks: A journey beyond scale-free. In: 8th ACM IKDD CODS and 26th COMAD, 2021. doi:10.1145/3430984.3431021

[11] CHATTOPADHYAY, S.—MURTHY, C. A.—PAL, S. K.: Fitting truncated geometric distributions in large scale real world networks, Theoret. Comput. Sci. 551 (2014), 22–38. doi:10.1016/j.tcs.2014.05.003

[12] CLAUSET, A.—SHALIZI, C. R.—NEWMAN, M. E. J.: Power-law distributions in empirical data, SIAM Review 51 (2009), 661–703. doi:10.1137/070710111

[13] DOMMA, F.: Some properties of the bivariate Burr type III distribution, Statistics 44 (2010), 203–215. doi:10.1080/02331880902986547

[14] DUNNING, K. A.—HANSON, J. N.: Generalized Pearson distributions and nonlinear programing, J. Stat. Comput. Simul. 6 (1977), 115–128. doi:10.1080/00949657708810176

[15] EMBRECHTS, P.—KLÜPPELBERG, C.—MIKOSCH, T.: Modelling Extremal Events: for Insurance and Finance, Springer Science & Business, Vol. 33, 2013. doi:10.1007/BF01440733

[16] FISK, P. R.: The graduation of income distributions, Econometrica 29 (1961), 171–185. doi:10.2307/1909287

[17] GOMES, A. E.—DA SILVA, C. Q.—CORDEIRO, G. M.: Two extended Burr models: Theory and practice, Comm. Statist. Theory Methods 44 (2015), 1706–1734. doi:10.1080/03610926.2012.762402

[18] GUPTA, R. C.—GUPTA, P. L.—GUPTA, R. D.: Modeling failure time data by Lehman alternatives, Comm. Statist. Theory Methods 27 (1998), 887–904. doi:10.1080/03610929808832134

[19] GUPTA, R. D.—GUPTA, R. C.: Analyzing skewed data by power normal model, Test 17 (2008), 197–210. doi:10.1007/s11749-006-0030-x

[20] GUPTA, R. D.—KUNDU, D.: Generalized exponential distributions, Aust. N. Z. J. Stat. 41 (1999), 173–188. doi:10.1111/1467-842X.00072

[21] JAMAL, F.—CHESNEAU, C.—NASIR, M. A.—SABOOR, A.—ALTUN, E.—KHAN, M. A.: On a modified Burr XII distribution having flexible hazard rate shapes, Math. Slovaca 70 (2020), 193–212. doi:10.1515/ms-2017-0344

[22] JAYAKUMAR, K.—MATHEW, T.: On a generalization to Marshall-Olkin scheme and its application to Burr type XII distribution, Statist. Papers 49 (2008), 421–439. doi:10.1007/s00362-006-0024-5

[23] KIM, M.—LESKOVEC, J.: Multiplicative attribute graph model of real-world networks, Internet Math. 8 (2012), 113–160. doi:10.2172/1124904

[24] KLEIBER, C.—KOTZ, S.: Statistical Size Distributions in Economics and Actuarial Sciences, John Wiley & Sons 470, 2003. doi:10.1002/0471457175

[25] KUMAR, D.: The Burr type XII distribution with some statistical properties, J. Data Sci. 16 (2017), 509–534. doi:10.6339/JDS.201707_15(3).0008

[26] KUMAR, D.—SARAN, J.—JAIN, N.: The exponentiated Burr XII distribution: moments and estimation based on lower record values, Sri Lankan J. Appl. Stat. 18 (2017), 1–18. doi:10.4038/sljastats.v18i1.7930

[27] LEHMANN, E. L.: The power of rank tests, Ann. Math. Statist. 24 (1953), 23–43. doi:10.1007/978-1-4614-1412-4_33

[28] LESKOVEC, J.—CHAKRABARTI, D.—KLEINBERG, J.—FALOUTSOS, C.—GHAHRAMANI, Z.: Kronecker graphs: An approach to modeling networks, J. Mach. Learn. Res. 11 (2010), 985–1042.

[29] LOMAX, K. S.: Business failures: Another example of the analysis of failure data, J. Amer. Statist. Assoc. 49 (1954), 847–852. doi:10.1080/01621459.1954.10501239

[30] MARSHALL, A. W.—OLKIN, I.: A new method for adding a parameter to a family of distributions with application to the exponential and Weibull families, Biometrika 84 (1997), 641–652. doi:10.1093/biomet/84.3.641

[31] MUDHOLKAR, G. S.—SRIVASTAVA, D. K.—FREIMER, M.: The exponentiated Weibull family: A reanalysis of the bus-motor-failure data, Technometrics 37 (1995), 436–445. doi:10.1080/00401706.1995.10484376

[32] NADARAJAH, S.—KOTZ, S.: The exponentiated type distributions, Acta Appl. Math. 92 (2006), 97–111. doi:10.1007/s10440-006-9055-0

[33] NEWMAN, M. E. J.: The structure of scientific collaboration networks, Proc. Natl. Acad. Sci. USA 98 (2001), 404–409. doi:10.1515/9781400841356.221

[34] NEWMAN, M. E. J.: The structure and function of complex networks, SIAM Review 45 (2003), 167–256. doi:10.1137/S003614450342480

[35] NEWMAN, M. E. J.: Power laws, Pareto distributions and Zipf’s law, Contemp. Phys. 46 (2005), 323–351. doi:10.1080/00107510500052444

[36] PARANAÍBA, P. F.—ORTEGA, E. M.—CORDEIRO, G. M.—PESCIM, R. R.: The beta Burr XII distribution with application to lifetime data, Comput. Statist. Data Anal. 55 (2011), 1118–1136. doi:10.1016/j.csda.2010.09.009

[37] PEARSON, K.: Contributions to the mathematical theory of evolution, Philosophical Transactions of the Royal Society of London 185 (1894), 71–110. doi:10.1098/rsta.1894.0003

[38] RODRIGUEZ, R. N.: A guide to the Burr type XII distributions, Biometrika 64 (1977), 129–134. doi:10.1093/biomet/64.1.129

[39] SÁNCHEZ, E.: Burr type XII as a superstatistical stationary distribution, Physica A: Stat. Mech. Appl. 516 (2019), 443–446. doi:10.1016/j.physa.2018.10.044

[40] STUMPF, M. P.—WIUF, C.—MAY, R. M.: Subnets of scale-free networks are not scale-free: sampling properties of networks, Proc. Natl. Acad. Sci. USA 102 (2005), 4221–4224. doi:10.1073/pnas.0501179102

[41] TADIKAMALLA, P. R.: A look at the Burr and related distributions, Int. Stat. Rev. 48 (1980), 337–344. doi:10.2307/1402945

[42] TAKAHASI, K.: Note on the multivariate Burr’s distribution, Ann. Inst. Statist. Math. 17 (1965), 257–260. doi:10.1007/BF02868169

[43] VOITALOV, I.—HOORN, P. V.—HOFSTAD, R. V.—KRIOUKOV, D.: Scale-free networks well done, Phys. Rev. Research 1 (2019), Art. 033034. doi:10.1103/PhysRevResearch.1.033034

[44] YANG, J.—LESKOVEC, J.: Defining and evaluating network communities based on ground-truth, Knowl. Inf. Syst. 42 (2015), 181–213. doi:10.1007/s10115-013-0693-z

Received: 2020-09-19
Accepted: 2021-01-11
Published Online: 2022-02-16
Published in Print: 2022-02-16

© 2022 Mathematical Institute Slovak Academy of Sciences

This work is licensed under the Creative Commons Attribution 4.0 International License.

Downloaded on 14.9.2025 from https://www.degruyterbrill.com/document/doi/10.1515/ms-2022-0016/html