Uploaded by Arushkrishna A

edited chapter 8-16

CHAPTER VIII
A NEW MODIFICATION OF QUASI GARIMA DISTRIBUTION WITH
PROPERTIES AND APPLICATIONS IN BIOSTATISTICS
8.1. Introduction
This study introduces the Length Biased Quasi Garima (LBQG) distribution as a new
generalization of the Quasi Garima distribution. Statistical properties of the new
distribution, such as moments, order statistics, survival analysis, and Bonferroni and Lorenz
curves, are studied. The parameters of the proposed distribution are estimated by the
method of maximum likelihood, and its Fisher's information matrix is also discussed.
Finally, the new distribution is fitted to real data sets to examine its superiority.
The Quasi Garima (QG) distribution is a recently introduced two-parameter lifetime model
proposed by Shanker et al. (2019); it contains the one-parameter exponential and Garima
distributions as particular cases. Its mathematical and statistical properties, including
moments and moment-based measures, hazard rate function, mean residual life function,
stochastic ordering, mean deviations, Bonferroni and Lorenz curves, order statistics, Renyi
entropy measure and stress-strength reliability, have been discussed. Its parameters were
estimated by two methods, the method of moments and the method of maximum likelihood.
The goodness of fit of the QG distribution was also examined using a real lifetime data set
from engineering, and the fit was found quite satisfactory compared with the one-parameter
exponential, Lindley and Garima distributions and the two-parameter quasi Shanker, gamma,
Weibull and lognormal distributions. Shanker (2016) introduced the Garima distribution with
applications in behavioral science, discussed several of its statistical properties and
estimated its parameter through the method of moments and the method of maximum
likelihood.
Communicated and waiting for acceptance with Sankhya A (SANK) The Indian Journal of
Statistics - Official Journal of Indian Statistical Institute ISSN:0972-7671
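The probability density function of the quasi Garima distribution (QGD) is given by
$$f(x;\theta,\alpha)=\frac{\theta^{2}}{\theta^{2}+\theta+\alpha}\,(1+\theta+\alpha x)\,e^{-\theta x};\quad x>0,\ \theta>0,\ \alpha>0 \tag{1}$$
and the cumulative distribution function of the quasi Garima distribution is given by
$$F(x;\theta,\alpha)=1-\left(1+\frac{\alpha\theta x}{\theta^{2}+\theta+\alpha}\right)e^{-\theta x};\quad x>0,\ \theta>0,\ \alpha>0 \tag{2}$$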
8.2. Length Biased Quasi Garima (LBQG) Distribution
Weighted distributions provide an approach in distribution theory for gaining a new
understanding of existing classical distributions. The concept of weighted distributions,
introduced by Fisher (1934), provides a unified approach to the problems of model
specification and data interpretation. Later, Rao (1965) formalized it in general terms to
deal with modelling statistical data when the usual practice of using classical
distributions was found to be inappropriate. Weighted distributions are applied in various
fields such as biomedicine, ecology, reliability, analysis of family data, meta-analysis,
analysis of intervention data and other areas for the development of proper statistical
models. Weighted distributions occur naturally in specifying probabilities of events as
observed and recorded, by adjusting the probabilities of actual occurrence of events to take
into account the method of ascertainment; failure to make such adjustments can lead to
wrong conclusions. When an investigator records observations in nature according to a
certain stochastic model, the recorded observations will not have the original distribution
unless every observation is given an equal chance of being recorded.
Weighted distributions also provide a technique for fitting models to an unknown weight
function when samples can be taken from both the original and the developed distributions,
and they are employed to modify the probabilities of events as observed and transcribed.
Weighted distributions occur in modelling clustered sampling, heterogeneity and extraneous
variation in the data set. A weighted distribution reduces to a length biased distribution
when the weight function considers only the length of the units. The concept of length
biased sampling was first introduced by Cox (1969) and Zelen (1974). The statistical
interpretation of the length biased distribution was originally identified by Cox (1962) in
the context of renewal theory. More generally, when the sampling mechanism selects units
with probability proportional to a measure of the unit size, the resulting distribution is
called size biased. Several sources provide detailed descriptions of length biased
distributions, and many newly introduced distributions, along with their length biased
versions, have been studied extensively in the literature. Examples include the following.
Das and Roy (2011) proposed the length biased weighted Weibull distribution. Rather and
Subramanian (2018) studied the length biased weighted generalized uniform distribution.
Mudasir and Ahmad (2018) presented the length biased Nakagami distribution with properties
and applications. Simmachan et al. (2018) discussed a new lifetime distribution based on a
re-parameterized model, called the two-sided length-biased inverse Gaussian distribution
(TSLBIGD). Kersey and Oluyede (2012) discussed the length biased inverse Weibull
distribution. Rajagopalan et al. (2019) studied the length biased Aradhana distribution with
applications. Subramanian and Shenbagaraja (2020) discussed the length biased quasi Sujatha
distribution with properties and applications. Ayesha (2017) presented the size-biased
Lindley distribution with its statistical properties and applications. Shanker and Shukla
(2018) discussed a generalized size-biased Poisson-Lindley distribution and its application
to modelling the size distribution of freely forming small groups. Modi and Gill (2015)
proposed the length biased weighted Maxwell distribution. Rather and Subramanian (2018)
studied the length biased Sushila distribution with applications. Recently, Ganaie and
Rajagopalan (2020) discussed the length biased two-parameter Pranav distribution with
characterizations and its applications.
Suppose the non-negative random variable X has probability density function f(x), and let
w(x) be a non-negative weight function. Then the probability density function of the
weighted random variable X_w is given by
$$f_w(x)=\frac{w(x)\,f(x)}{E(w(x))},\quad x>0,$$
where w(x) is the non-negative weight function and
$$E(w(x))=\int w(x)\,f(x)\,dx<\infty .$$
Different choices of the weight function w(x) give weighted models of various forms; in
particular, when w(x) = x^c the resulting distribution is called a weighted distribution. In
this study we consider the LBQG distribution, so to obtain the length biased version of the
quasi Garima distribution we take c = 1, that is, w(x) = x. The probability density function
of the length biased distribution is then given by
$$f_l(x)=\frac{x\,f(x)}{E(x)} \tag{3}$$
where
$$E(x)=\int_{0}^{\infty}x\,f(x;\theta,\alpha)\,dx$$
$$E(x)=\frac{\theta+\theta^{2}+2\alpha}{\theta(\theta^{2}+\theta+\alpha)} \tag{4}$$
Substituting equations (1) and (4) in equation (3), we obtain the probability density function of the LBQG
distribution
$$f_l(x)=\frac{\theta^{3}}{\theta+\theta^{2}+2\alpha}\,x\,(1+\theta+\alpha x)\,e^{-\theta x} \tag{5}$$
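and the cumulative distribution function of the LBQG distribution can be obtained as
$$F_l(x)=\int_{0}^{x}f_l(x)\,dx$$
$$F_l(x)=\int_{0}^{x}\frac{\theta^{3}}{\theta+\theta^{2}+2\alpha}\,(1+\theta+\alpha x)\,x\,e^{-\theta x}\,dx$$
$$F_l(x)=\frac{\theta^{3}}{\theta+\theta^{2}+2\alpha}\int_{0}^{x}(1+\theta+\alpha x)\,x\,e^{-\theta x}\,dx$$
$$F_l(x)=\frac{\theta^{3}}{\theta+\theta^{2}+2\alpha}\left(\int_{0}^{x}x\,e^{-\theta x}\,dx+\theta\int_{0}^{x}x\,e^{-\theta x}\,dx+\alpha\int_{0}^{x}x^{2}e^{-\theta x}\,dx\right) \tag{6}$$
Put θx = t ⇒ θ dx = dt ⇒ dx = dt/θ. Also x = t/θ.
When x → x, t → θx and when x → 0, t → 0.
After simplification of equation (6), we obtain the cumulative distribution function of the LBQG
distribution
$$F_l(x)=\frac{1}{\theta+\theta^{2}+2\alpha}\left(\theta\,\gamma(2,\theta x)+\theta^{2}\gamma(2,\theta x)+\alpha\,\gamma(3,\theta x)\right) \tag{7}$$
where $\gamma(s,x)=\int_{0}^{x}t^{s-1}e^{-t}\,dt$ denotes the lower incomplete gamma function.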
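As a numerical illustration of equations (5) and (7), the following Python sketch evaluates the LBQG density and distribution function and compares the closed-form cdf with a trapezoidal integral of the pdf. The function names, the parameter values θ = 1.5 and α = 2 and the use of the elementary closed forms γ(2, z) = 1 − e^{−z}(1 + z) and γ(3, z) = 2 − e^{−z}(z² + 2z + 2) are choices made for this illustration and are not part of the original study.

```python
import math

def lbqg_pdf(x, theta, alpha):
    """Density (5) of the LBQG distribution for x > 0."""
    norm = theta + theta**2 + 2.0 * alpha
    return theta**3 * x * (1.0 + theta + alpha * x) * math.exp(-theta * x) / norm

def lower_gamma_2(z):
    """Lower incomplete gamma function gamma(2, z) in closed form."""
    return 1.0 - math.exp(-z) * (1.0 + z)

def lower_gamma_3(z):
    """Lower incomplete gamma function gamma(3, z) in closed form."""
    return 2.0 - math.exp(-z) * (z * z + 2.0 * z + 2.0)

def lbqg_cdf(x, theta, alpha):
    """Distribution function (7) of the LBQG distribution."""
    norm = theta + theta**2 + 2.0 * alpha
    z = theta * x
    return ((theta + theta**2) * lower_gamma_2(z) + alpha * lower_gamma_3(z)) / norm

def trapezoid(func, a, b, n=20000):
    """Composite trapezoidal rule, used only as an independent check."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h

if __name__ == "__main__":
    theta, alpha = 1.5, 2.0  # illustrative parameter values (assumed)
    x = 2.0
    closed_form = lbqg_cdf(x, theta, alpha)
    numeric = trapezoid(lambda u: lbqg_pdf(u, theta, alpha), 0.0, x)
    print(closed_form, numeric)  # the two values should agree closely
```

Agreement between the two printed values gives a quick check that (7) is the integral of (5).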
Figure 8.1 pdf plot of LBQG distribution
Figure 8.2 cdf plot of LBQG distribution
8.3. Survival Analysis
In this section, we derive the survival function, hazard rate and reverse hazard rate
functions of the LBQG distribution.
a). Survival function
The survival function is defined as the probability that a system survives beyond a
specified time and is the complement of the cumulative distribution function. The
survival function or reliability function of the LBQG distribution can be obtained as
$$S(x)=1-F_l(x)$$
$$S(x)=1-\frac{1}{\theta+\theta^{2}+2\alpha}\left(\theta\,\gamma(2,\theta x)+\theta^{2}\gamma(2,\theta x)+\alpha\,\gamma(3,\theta x)\right)$$
b). Hazard function
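The hazard function is also known as the instantaneous failure rate or force of mortality
and is given by
$$h(x)=\frac{f_l(x)}{1-F_l(x)}$$
$$h(x)=\frac{x\,\theta^{3}(1+\theta+\alpha x)\,e^{-\theta x}}{(\theta+\theta^{2}+2\alpha)-\left(\theta\,\gamma(2,\theta x)+\theta^{2}\gamma(2,\theta x)+\alpha\,\gamma(3,\theta x)\right)}$$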
c). Reverse hazard function
The reverse hazard function of the LBQG distribution is given by
$$h_r(x)=\frac{f_l(x)}{F_l(x)}$$
$$h_r(x)=\frac{x\,\theta^{3}(1+\theta+\alpha x)\,e^{-\theta x}}{\theta\,\gamma(2,\theta x)+\theta^{2}\gamma(2,\theta x)+\alpha\,\gamma(3,\theta x)}$$
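The three functions of this section can be evaluated numerically as in the sketch below. It is only an illustration: the function names and parameter values are assumed, and the survival function is written with the upper incomplete gamma functions Γ(2, z) = e^{−z}(1 + z) and Γ(3, z) = e^{−z}(z² + 2z + 2), which is algebraically the same as 1 − F_l(x) but avoids subtracting two nearly equal numbers in the right tail.

```python
import math

def lbqg_pdf(x, theta, alpha):
    """LBQG density f_l(x)."""
    norm = theta + theta**2 + 2.0 * alpha
    return theta**3 * x * (1.0 + theta + alpha * x) * math.exp(-theta * x) / norm

def lbqg_survival(x, theta, alpha):
    """S(x) = 1 - F_l(x), expressed through upper incomplete gamma functions."""
    norm = theta + theta**2 + 2.0 * alpha
    z = theta * x
    upper2 = math.exp(-z) * (1.0 + z)                  # Gamma(2, z)
    upper3 = math.exp(-z) * (z * z + 2.0 * z + 2.0)    # Gamma(3, z)
    return ((theta + theta**2) * upper2 + alpha * upper3) / norm

def lbqg_hazard(x, theta, alpha):
    """h(x) = f_l(x) / S(x)."""
    return lbqg_pdf(x, theta, alpha) / lbqg_survival(x, theta, alpha)

def lbqg_reverse_hazard(x, theta, alpha):
    """h_r(x) = f_l(x) / F_l(x), defined for x > 0."""
    return lbqg_pdf(x, theta, alpha) / (1.0 - lbqg_survival(x, theta, alpha))

if __name__ == "__main__":
    theta, alpha = 1.5, 2.0  # illustrative parameter values (assumed)
    for x in (0.5, 1.0, 5.0, 100.0):
        print(x, lbqg_survival(x, theta, alpha),
              lbqg_hazard(x, theta, alpha),
              lbqg_reverse_hazard(x, theta, alpha))
```

For these parameter values the hazard rate increases with x and approaches θ for large x, as expected for a density with an e^{−θx} tail.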
Figure 8.3 survival plot of LBQG distribution
Figure 8.4 Hazard plot of LBQG distribution
8.4. Structural Measures
In this section, various statistical properties of LBQG distribution have been
investigated including its moments, harmonic mean, moment generating function and
characteristic function.
8.4.1 Moments
Suppose the random variable X follows the LBQG distribution with parameters θ and
α; then the rth order moment E(X^r) of the LBQG distribution can be obtained as
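$$E(X^{r})=\mu_r'=\int_{0}^{\infty}x^{r}f_l(x)\,dx$$
$$E(X^{r})=\mu_r'=\int_{0}^{\infty}x^{r}\,\frac{\theta^{3}}{\theta+\theta^{2}+2\alpha}\,x\,(1+\theta+\alpha x)\,e^{-\theta x}\,dx$$
$$E(X^{r})=\mu_r'=\frac{\theta^{3}}{\theta+\theta^{2}+2\alpha}\int_{0}^{\infty}x^{r+1}(1+\theta+\alpha x)\,e^{-\theta x}\,dx$$
$$E(X^{r})=\mu_r'=\frac{\theta^{3}}{\theta+\theta^{2}+2\alpha}\left(\int_{0}^{\infty}x^{(r+2)-1}e^{-\theta x}\,dx+\theta\int_{0}^{\infty}x^{(r+2)-1}e^{-\theta x}\,dx+\alpha\int_{0}^{\infty}x^{(r+3)-1}e^{-\theta x}\,dx\right) \tag{8}$$
After simplification of equation (8), we obtain
$$E(X^{r})=\mu_r'=\frac{\theta\,\Gamma(r+2)+\theta^{2}\Gamma(r+2)+\alpha\,\Gamma(r+3)}{\theta^{r}(\theta+\theta^{2}+2\alpha)} \tag{9}$$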
By putting r = 1, 2, 3 and 4 in equation (9), we obtain the first four moments of the LBQG
distribution.
$$E(X)=\mu_1'=\frac{2\theta+2\theta^{2}+6\alpha}{\theta(\theta+\theta^{2}+2\alpha)}$$
$$E(X^{2})=\mu_2'=\frac{6\theta+6\theta^{2}+24\alpha}{\theta^{2}(\theta+\theta^{2}+2\alpha)}$$
$$E(X^{3})=\mu_3'=\frac{24\theta+24\theta^{2}+120\alpha}{\theta^{3}(\theta+\theta^{2}+2\alpha)}$$
$$E(X^{4})=\mu_4'=\frac{120\theta+120\theta^{2}+720\alpha}{\theta^{4}(\theta+\theta^{2}+2\alpha)}$$
$$\text{Variance}=\frac{6\theta+6\theta^{2}+24\alpha}{\theta^{2}(\theta+\theta^{2}+2\alpha)}-\left(\frac{2\theta+2\theta^{2}+6\alpha}{\theta(\theta+\theta^{2}+2\alpha)}\right)^{2}$$
$$\text{S.D.}(\sigma)=\sqrt{\frac{6\theta+6\theta^{2}+24\alpha}{\theta^{2}(\theta+\theta^{2}+2\alpha)}-\left(\frac{2\theta+2\theta^{2}+6\alpha}{\theta(\theta+\theta^{2}+2\alpha)}\right)^{2}}$$
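Equation (9) and the variance formula can be evaluated directly, as in the following minimal Python sketch. The function names and the parameter values θ = 1, α = 1 are illustrative assumptions; for these values the formulas give a mean of 10/4 = 2.5 and a variance of 36/4 − 2.5² = 2.75.

```python
import math

def lbqg_raw_moment(r, theta, alpha):
    """Raw moment E(X^r) from equation (9), for integer r >= 0."""
    norm = theta + theta**2 + 2.0 * alpha
    numerator = (theta + theta**2) * math.gamma(r + 2) + alpha * math.gamma(r + 3)
    return numerator / (theta**r * norm)

def lbqg_mean_variance(theta, alpha):
    """Mean and variance obtained from the first two raw moments."""
    m1 = lbqg_raw_moment(1, theta, alpha)
    m2 = lbqg_raw_moment(2, theta, alpha)
    return m1, m2 - m1**2

if __name__ == "__main__":
    mean, variance = lbqg_mean_variance(1.0, 1.0)  # illustrative values (assumed)
    print(mean, variance)  # prints 2.5 2.75
```

Setting r = 0 returns 1, which confirms that the density (5) integrates to one.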
8.4.2 Harmonic mean
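The harmonic mean of the proposed LBQG distribution can be obtained from the expectation
$$H.M=E\left(\frac{1}{x}\right)=\int_{0}^{\infty}\frac{1}{x}\,f_l(x)\,dx$$
$$H.M=\int_{0}^{\infty}\frac{\theta^{3}}{\theta+\theta^{2}+2\alpha}\,(1+\theta+\alpha x)\,e^{-\theta x}\,dx$$
$$H.M=\frac{\theta^{3}}{\theta+\theta^{2}+2\alpha}\int_{0}^{\infty}(1+\theta+\alpha x)\,e^{-\theta x}\,dx$$
$$H.M=\frac{\theta^{3}}{\theta+\theta^{2}+2\alpha}\left(\int_{0}^{\infty}e^{-\theta x}\,dx+\theta\int_{0}^{\infty}e^{-\theta x}\,dx+\alpha\int_{0}^{\infty}x\,e^{-\theta x}\,dx\right)$$
After simplification of the above equation, we obtain
$$H.M=\frac{\theta^{3}}{\theta+\theta^{2}+2\alpha}\left(\frac{1}{\theta}+1+\frac{\alpha}{\theta^{2}}\right)$$
$$H.M=\frac{\theta(\theta^{2}+\theta+\alpha)}{\theta+\theta^{2}+2\alpha}$$
Note that this expression is E(1/X); under the usual definition the harmonic mean is its
reciprocal, (θ + θ² + 2α) / (θ(θ² + θ + α)).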
8.4.3 Moment Generating Function and Characteristic Function
The moment generating function (MGF) is an alternative specification of a distribution in probability
theory and statistics and can be used to find its moments. Let X be a random variable following the LBQG
distribution; then the MGF of X can be obtained as
$$M_X(t)=E(e^{tx})=\int_{0}^{\infty}e^{tx}f_l(x)\,dx$$
Using Taylor series, we get
$$M_X(t)=\int_{0}^{\infty}\left(1+tx+\frac{(tx)^{2}}{2!}+\dots\right)f_l(x)\,dx$$
$$M_X(t)=\int_{0}^{\infty}\sum_{j=0}^{\infty}\frac{t^{j}}{j!}\,x^{j}f_l(x)\,dx$$
$$M_X(t)=\sum_{j=0}^{\infty}\frac{t^{j}}{j!}\,\mu_j'$$
$$M_X(t)=\sum_{j=0}^{\infty}\frac{t^{j}}{j!}\left(\frac{\theta\,\Gamma(j+2)+\theta^{2}\Gamma(j+2)+\alpha\,\Gamma(j+3)}{\theta^{j}(\theta+\theta^{2}+2\alpha)}\right)$$
$$M_X(t)=\frac{1}{\theta+\theta^{2}+2\alpha}\sum_{j=0}^{\infty}\frac{t^{j}}{j!\,\theta^{j}}\left(\theta\,\Gamma(j+2)+\theta^{2}\Gamma(j+2)+\alpha\,\Gamma(j+3)\right)$$
Similarly, the characteristic function of the LBQG distribution can be obtained as
$$\phi_X(t)=M_X(it)$$
$$\phi_X(t)=\frac{1}{\theta+\theta^{2}+2\alpha}\sum_{j=0}^{\infty}\frac{(it)^{j}}{j!\,\theta^{j}}\left(\theta\,\Gamma(j+2)+\theta^{2}\Gamma(j+2)+\alpha\,\Gamma(j+3)\right)$$
8.5. Order Statistics
Order statistics are a useful concept in the statistical sciences and have a wide range of
applications, for example in modelling auctions, car races and insurance policies. Suppose X(1), X(2), …, X(n)
denote the order statistics of a random sample X1, X2, …, Xn drawn from a continuous population
with probability density function f_X(x) and cumulative distribution function F_X(x); then the
probability density function of the rth order statistic X(r) is given by
$$f_{X(r)}(x)=\frac{n!}{(r-1)!\,(n-r)!}\,f_X(x)\left(F_X(x)\right)^{r-1}\left(1-F_X(x)\right)^{n-r} \tag{10}$$
By substituting equations (5) and (7) in equation (10), we obtain the probability density function
of the rth order statistic of the LBQG distribution.
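$$f_{X(r)}(x)=\frac{n!}{(r-1)!\,(n-r)!}\left(\frac{\theta^{3}}{\theta+\theta^{2}+2\alpha}\,x\,(1+\theta+\alpha x)\,e^{-\theta x}\right)$$
$$\times\left(\frac{1}{\theta+\theta^{2}+2\alpha}\left(\theta\,\gamma(2,\theta x)+\theta^{2}\gamma(2,\theta x)+\alpha\,\gamma(3,\theta x)\right)\right)^{r-1}$$
$$\times\left(1-\frac{1}{\theta+\theta^{2}+2\alpha}\left(\theta\,\gamma(2,\theta x)+\theta^{2}\gamma(2,\theta x)+\alpha\,\gamma(3,\theta x)\right)\right)^{n-r}$$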
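Therefore, the probability density function of the largest order statistic X(n) of the LBQG distribution can be
obtained as
$$f_{X(n)}(x)=\frac{n\,\theta^{3}}{\theta+\theta^{2}+2\alpha}\,x\,(1+\theta+\alpha x)\,e^{-\theta x}$$
$$\times\left(\frac{1}{\theta+\theta^{2}+2\alpha}\left(\theta\,\gamma(2,\theta x)+\theta^{2}\gamma(2,\theta x)+\alpha\,\gamma(3,\theta x)\right)\right)^{n-1}$$
and the probability density function of the first order statistic X(1) of the LBQG distribution can be obtained
as
$$f_{X(1)}(x)=\frac{n\,\theta^{3}}{\theta+\theta^{2}+2\alpha}\,x\,(1+\theta+\alpha x)\,e^{-\theta x}$$
$$\times\left(1-\frac{1}{\theta+\theta^{2}+2\alpha}\left(\theta\,\gamma(2,\theta x)+\theta^{2}\gamma(2,\theta x)+\alpha\,\gamma(3,\theta x)\right)\right)^{n-1}$$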
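The density of the rth order statistic in equation (10) can be evaluated for any r and n once the LBQG pdf and cdf are available. The sketch below is a minimal illustration with assumed function names and parameter values; r = n gives the density of X(n) and r = 1 gives the density of X(1).

```python
import math

def lbqg_pdf(x, theta, alpha):
    """LBQG density, equation (5)."""
    norm = theta + theta**2 + 2.0 * alpha
    return theta**3 * x * (1.0 + theta + alpha * x) * math.exp(-theta * x) / norm

def lbqg_cdf(x, theta, alpha):
    """LBQG distribution function, equation (7), with closed-form gamma(2, z), gamma(3, z)."""
    norm = theta + theta**2 + 2.0 * alpha
    z = theta * x
    g2 = 1.0 - math.exp(-z) * (1.0 + z)
    g3 = 2.0 - math.exp(-z) * (z * z + 2.0 * z + 2.0)
    return ((theta + theta**2) * g2 + alpha * g3) / norm

def order_stat_pdf(x, r, n, theta, alpha):
    """Density of the r-th order statistic X(r) in a sample of size n, equation (10)."""
    coeff = math.factorial(n) / (math.factorial(r - 1) * math.factorial(n - r))
    f = lbqg_pdf(x, theta, alpha)
    big_f = lbqg_cdf(x, theta, alpha)
    return coeff * f * big_f ** (r - 1) * (1.0 - big_f) ** (n - r)

if __name__ == "__main__":
    theta, alpha, n = 1.5, 2.0, 5  # illustrative values (assumed)
    for r in (1, 3, 5):
        print(r, order_stat_pdf(1.0, r, n, theta, alpha))
```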
8.6. Bonferroni and Lorenz Curves
The Bonferroni and Lorenz curves are also known as income distribution curves and
are used in economics to study inequality in the distribution of income or poverty.
They are now also used in various other fields such as reliability, medicine, insurance
and demography. The Bonferroni and Lorenz curves are defined as
$$B(p)=\frac{1}{p\,\mu_1'}\int_{0}^{q}x\,f_l(x)\,dx$$
$$L(p)=p\,B(p)=\frac{1}{\mu_1'}\int_{0}^{q}x\,f_l(x)\,dx$$
where
$$\mu_1'=\frac{2\theta+2\theta^{2}+6\alpha}{\theta(\theta+\theta^{2}+2\alpha)}\quad\text{and}\quad q=F^{-1}(p)$$
$$B(p)=\frac{\theta(\theta+\theta^{2}+2\alpha)}{p(2\theta+2\theta^{2}+6\alpha)}\int_{0}^{q}\frac{\theta^{3}}{\theta+\theta^{2}+2\alpha}\,x^{2}(1+\theta+\alpha x)\,e^{-\theta x}\,dx$$
$$B(p)=\frac{\theta^{4}}{p(2\theta+2\theta^{2}+6\alpha)}\int_{0}^{q}x^{2}(1+\theta+\alpha x)\,e^{-\theta x}\,dx$$
$$B(p)=\frac{\theta^{4}}{p(2\theta+2\theta^{2}+6\alpha)}\left(\int_{0}^{q}x^{3-1}e^{-\theta x}\,dx+\theta\int_{0}^{q}x^{3-1}e^{-\theta x}\,dx+\alpha\int_{0}^{q}x^{4-1}e^{-\theta x}\,dx\right)$$
Using the substitution t = θx, the integrals equal γ(3, θq)/θ³, γ(3, θq)/θ³ and γ(4, θq)/θ⁴
respectively, so that after simplification we get
$$B(p)=\frac{\theta\,\gamma(3,\theta q)+\theta^{2}\gamma(3,\theta q)+\alpha\,\gamma(4,\theta q)}{p(2\theta+2\theta^{2}+6\alpha)}$$
$$L(p)=\frac{\theta\,\gamma(3,\theta q)+\theta^{2}\gamma(3,\theta q)+\alpha\,\gamma(4,\theta q)}{2\theta+2\theta^{2}+6\alpha}$$
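The Lorenz and Bonferroni curves can be computed numerically as sketched below. The sketch is an illustration under stated assumptions: it uses the form L(p) = (θγ(3, θq) + θ²γ(3, θq) + αγ(4, θq)) / (2θ + 2θ² + 6α), obtained by carrying the substitution t = θx through the integrals so that L(1) = 1, finds q = F⁻¹(p) by bisection on the cdf of the LBQG distribution, and uses the elementary closed forms of γ(3, z) and γ(4, z). Function names and parameter values are assumed.

```python
import math

def lbqg_cdf(x, theta, alpha):
    """LBQG distribution function with closed-form gamma(2, z) and gamma(3, z)."""
    norm = theta + theta**2 + 2.0 * alpha
    z = theta * x
    g2 = 1.0 - math.exp(-z) * (1.0 + z)
    g3 = 2.0 - math.exp(-z) * (z * z + 2.0 * z + 2.0)
    return ((theta + theta**2) * g2 + alpha * g3) / norm

def lbqg_quantile(p, theta, alpha, tol=1e-12):
    """q = F^{-1}(p) by bisection; assumes 0 < p < 1."""
    lo, hi = 0.0, 1.0
    while lbqg_cdf(hi, theta, alpha) < p:   # expand the bracket until it contains q
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lbqg_cdf(mid, theta, alpha) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lbqg_lorenz(p, theta, alpha):
    """Lorenz curve L(p) of the LBQG distribution."""
    z = theta * lbqg_quantile(p, theta, alpha)
    g3 = 2.0 - math.exp(-z) * (z * z + 2.0 * z + 2.0)               # gamma(3, z)
    g4 = 6.0 - math.exp(-z) * (z**3 + 3.0 * z * z + 6.0 * z + 6.0)  # gamma(4, z)
    denom = 2.0 * theta + 2.0 * theta**2 + 6.0 * alpha
    return ((theta + theta**2) * g3 + alpha * g4) / denom

def lbqg_bonferroni(p, theta, alpha):
    """Bonferroni curve B(p) = L(p) / p."""
    return lbqg_lorenz(p, theta, alpha) / p

if __name__ == "__main__":
    theta, alpha = 1.0, 1.0  # illustrative values (assumed)
    for p in (0.1, 0.5, 0.9):
        print(p, lbqg_lorenz(p, theta, alpha), lbqg_bonferroni(p, theta, alpha))
```

As for any Lorenz curve, the computed L(p) lies below the line of equality, L(p) ≤ p, and tends to 1 as p tends to 1.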
8.7 Maximum Likelihood Estimation and Fisher’s Information Matrix
In this section, we discuss the parameter estimation of the LBQG distribution by
the method of maximum likelihood and also derive its Fisher's
information matrix. Suppose X1, X2, …, Xn is a random sample of size n drawn from the
LBQG distribution; then the likelihood function can be written as
$$L(x)=\prod_{i=1}^{n}f_l(x_i)$$
$$L(x)=\prod_{i=1}^{n}\left(\frac{\theta^{3}}{\theta+\theta^{2}+2\alpha}\,x_i\,(1+\theta+\alpha x_i)\,e^{-\theta x_i}\right)$$
$$L(x)=\frac{\theta^{3n}}{(\theta+\theta^{2}+2\alpha)^{n}}\prod_{i=1}^{n}\left(x_i\,(1+\theta+\alpha x_i)\,e^{-\theta x_i}\right)$$
The log likelihood function is given by
$$\log L=3n\log\theta-n\log(\theta+\theta^{2}+2\alpha)+\sum_{i=1}^{n}\log x_i+\sum_{i=1}^{n}\log(1+\theta+\alpha x_i)-\theta\sum_{i=1}^{n}x_i \tag{11}$$
Differentiating equation (11) with respect to the parameters θ and α, we obtain the
normal equations as
$$\frac{\partial\log L}{\partial\theta}=\frac{3n}{\theta}-n\left(\frac{1+2\theta}{\theta+\theta^{2}+2\alpha}\right)+\sum_{i=1}^{n}\left(\frac{1}{1+\theta+\alpha x_i}\right)-\sum_{i=1}^{n}x_i=0$$
$$\frac{\partial\log L}{\partial\alpha}=-n\left(\frac{2}{\theta+\theta^{2}+2\alpha}\right)+\sum_{i=1}^{n}\left(\frac{x_i}{1+\theta+\alpha x_i}\right)=0$$
The above system of nonlinear equations is too complicated to solve
algebraically; therefore we use a numerical technique such as the Newton-Raphson method for
estimating the required parameters of the proposed distribution.
We use the asymptotic normality results to obtain confidence intervals. Let
λ̂ = (θ̂, α̂) denote the MLE of λ = (θ, α). The result can be stated as follows:
$$\sqrt{n}\,(\hat{\lambda}-\lambda)\rightarrow N_2\left(0,\,I^{-1}(\lambda)\right)$$
where I(λ) is Fisher's information matrix, i.e.,
$$I(\lambda)=-\frac{1}{n}\begin{pmatrix}E\left(\dfrac{\partial^{2}\log L}{\partial\theta^{2}}\right) & E\left(\dfrac{\partial^{2}\log L}{\partial\theta\,\partial\alpha}\right)\\[2ex] E\left(\dfrac{\partial^{2}\log L}{\partial\alpha\,\partial\theta}\right) & E\left(\dfrac{\partial^{2}\log L}{\partial\alpha^{2}}\right)\end{pmatrix}$$
where
$$E\left(\frac{\partial^{2}\log L}{\partial\theta^{2}}\right)=-\frac{3n}{\theta^{2}}-n\left(\frac{2(\theta+\theta^{2}+2\alpha)-(1+2\theta)^{2}}{(\theta+\theta^{2}+2\alpha)^{2}}\right)-\sum_{i=1}^{n}E\left(\frac{1}{(1+\theta+\alpha x_i)^{2}}\right)$$
$$E\left(\frac{\partial^{2}\log L}{\partial\alpha^{2}}\right)=n\left(\frac{4}{(\theta+\theta^{2}+2\alpha)^{2}}\right)-\sum_{i=1}^{n}E\left(\frac{x_i^{2}}{(1+\theta+\alpha x_i)^{2}}\right)$$
$$E\left(\frac{\partial^{2}\log L}{\partial\theta\,\partial\alpha}\right)=n\left(\frac{2(1+2\theta)}{(\theta+\theta^{2}+2\alpha)^{2}}\right)-\sum_{i=1}^{n}E\left(\frac{x_i}{(1+\theta+\alpha x_i)^{2}}\right)$$
Since λ is unknown, we estimate I⁻¹(λ) by I⁻¹(λ̂), and this can be used to obtain asymptotic
confidence intervals for θ and α.
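A numerical optimizer such as Newton-Raphson needs the log-likelihood (11) and its first derivatives. The sketch below codes both and checks the analytic score against central finite differences. It is an illustration only: the function names and the small data vector are hypothetical, not the data of this study, and the θ-derivative of log(θ + θ² + 2α) is taken as (1 + 2θ)/(θ + θ² + 2α).

```python
import math

def lbqg_loglik(data, theta, alpha):
    """Log-likelihood (11) of the LBQG distribution."""
    n = len(data)
    norm = theta + theta**2 + 2.0 * alpha
    return (3.0 * n * math.log(theta) - n * math.log(norm)
            + sum(math.log(x) for x in data)
            + sum(math.log(1.0 + theta + alpha * x) for x in data)
            - theta * sum(data))

def lbqg_score(data, theta, alpha):
    """First derivatives of the log-likelihood with respect to theta and alpha."""
    n = len(data)
    norm = theta + theta**2 + 2.0 * alpha
    d_theta = (3.0 * n / theta - n * (1.0 + 2.0 * theta) / norm
               + sum(1.0 / (1.0 + theta + alpha * x) for x in data)
               - sum(data))
    d_alpha = (-2.0 * n / norm
               + sum(x / (1.0 + theta + alpha * x) for x in data))
    return d_theta, d_alpha

def numeric_score(data, theta, alpha, h=1e-6):
    """Central finite-difference approximation of the score, used as a check."""
    d_theta = (lbqg_loglik(data, theta + h, alpha)
               - lbqg_loglik(data, theta - h, alpha)) / (2.0 * h)
    d_alpha = (lbqg_loglik(data, theta, alpha + h)
               - lbqg_loglik(data, theta, alpha - h)) / (2.0 * h)
    return d_theta, d_alpha

if __name__ == "__main__":
    sample = [0.5, 1.2, 2.3, 0.9, 3.1]  # hypothetical observations for illustration
    print(lbqg_score(sample, 1.0, 1.0))
    print(numeric_score(sample, 1.0, 1.0))
```

The two printed pairs should agree to several decimal places; the MLE is the point at which both components of the score vanish.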
8.8 Data Analysis
In this section, the goodness of fit of the LBQG distribution is examined by analysing real
data sets, to show that the LBQG distribution fits better than the
quasi Garima, Garima, exponential and Lindley distributions. 603 subjects were randomly
selected from various hospitals in the two districts of Palakkad and Malappuram in Kerala
for the real data analysis. R software is employed to estimate the unknown parameters along
with the model comparison criterion values. In order to compare the LBQG distribution with
the quasi Garima, Garima, exponential and Lindley distributions, we apply the AIC (Akaike
Information Criterion), AICC (Akaike Information Criterion Corrected), BIC (Bayesian
Information Criterion) and -2logL. The better distribution is the one that corresponds to
lower values of AIC, BIC, AICC and -2logL. AIC, BIC and AICC are
evaluated by using the following formulas:
$$AIC=2k-2\log L,\qquad BIC=k\log n-2\log L\qquad\text{and}\qquad AICC=AIC+\frac{2k(k+1)}{n-k-1}$$
where k is the number of parameters in the statistical model, n is the sample size and
log L is the maximized value of the log-likelihood function under the considered model.
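The three criteria are simple functions of -2logL, k and n, as the following sketch shows. The function name is illustrative. The call uses the LBQG value -2logL = 195.9141 from Table 8.1 with k = 2; the sample size n = 63 is an assumption made here because it reproduces the BIC and AICC values reported in Table 8.1, whereas the text above mentions 603 subjects.

```python
import math

def information_criteria(neg2_loglik, k, n):
    """Return (AIC, BIC, AICC) given -2 log L, number of parameters k and sample size n."""
    aic = 2 * k + neg2_loglik
    bic = k * math.log(n) + neg2_loglik
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)
    return aic, bic, aicc

if __name__ == "__main__":
    # -2logL and k for the LBQG fit are taken from Table 8.1; n = 63 is assumed.
    aic, bic, aicc = information_criteria(195.9141, k=2, n=63)
    print(f"{aic:.4f} {bic:.4f} {aicc:.4f}")  # prints 199.9141 204.2004 200.1141
```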
Table 8.1: Comparison of fitted distributions

Distribution    MLE                 S.E.               -2logL     AIC        BIC        AICC
LBQG            α̂ = 6.260663       α̂ = 1.677726      195.9141   199.9141   204.2004   200.1141
                θ̂ = 9.805593       θ̂ = 7.132346
Quasi Garima    α̂ = 3355.054889    α̂ = 0.0102445     220.7157   224.7157   229.0019   224.9157
                θ̂ = 0.653618       θ̂ = 0.000000
Garima          θ̂ = 0.47847006     θ̂ = 0.05103353    256.3205   258.3205   260.4636   258.3860
Exponential     θ̂ = 0.32687301     θ̂ = 0.04118174    266.8915   268.8915   271.0347   268.9570
Lindley         θ̂ = 0.53923226     θ̂ = 0.04958387    242.7153   244.7153   246.8584   244.7808
From the results given in Table 8.1, it is clearly observed that the LBQG
distribution has smaller AIC, BIC, AICC and -2logL values than the quasi Garima,
Garima, exponential and Lindley distributions. Hence it can be concluded that the LBQG
distribution provides a better fit than the quasi Garima, Garima, exponential and
Lindley distributions.
8.9. Conclusion
This study describes a new model based on the two-parameter Quasi Garima distribution,
named the Length Biased Quasi Garima (LBQG) distribution. The distribution is
generated by using the length biased technique. Its statistical properties, including its
moments, harmonic mean, moment generating function, characteristic function, order
statistics, and Bonferroni and Lorenz curves, have been investigated. Its parameters have also
been estimated by the method of maximum likelihood. Lastly, the LBQG distribution
has been fitted to real data sets to assess its goodness of fit, and its fit
has been found to be better than that of the Quasi Garima, Garima,
exponential and Lindley distributions.