CHAPTER VIII

A NEW MODIFICATION OF QUASI GARIMA DISTRIBUTION WITH PROPERTIES AND APPLICATIONS IN BIOSTATISTICS

8.1. Introduction

This study introduces the Length Biased Quasi Garima (LBQG) distribution as a new generalization of the Quasi Garima distribution. Its statistical properties, such as moments, order statistics, survival analysis, and Bonferroni and Lorenz curves, are studied. The parameters of the proposed distribution are estimated by the method of maximum likelihood, and its Fisher's information matrix is discussed. Finally, the new distribution is fitted to real data sets to examine its superiority.

The Quasi Garima (QG) distribution is a recently introduced two-parameter lifetime model proposed by Shanker et al. (2019); it contains the one-parameter exponential and Garima distributions as special cases. Its mathematical and statistical properties, including moments and moment-based measures, hazard rate function, mean residual life function, stochastic ordering, mean deviations, Bonferroni and Lorenz curves, order statistics, Renyi entropy measure and stress-strength reliability, have been discussed. Two methods have been used to estimate its parameters, namely the method of moments and the method of maximum likelihood. The goodness of fit of the QG distribution has also been discussed using a real lifetime data set from engineering, and the fit has been found quite satisfactory compared with the one-parameter exponential, Lindley and Garima distributions and the two-parameter quasi Shanker, gamma, Weibull and lognormal distributions.
Shanker (2016) introduced the Garima distribution with applications in behavioural science, discussed several of its statistical properties, and estimated its parameters by the method of moments and the method of maximum likelihood.

Communicated and waiting for acceptance with Sankhya A (SANK) The Indian Journal of Statistics - Official Journal of Indian Statistical Institute ISSN:0972-7671

The probability density function of the quasi Garima distribution (QGD) is given by

\[ f(x;\theta,\alpha)=\frac{\theta^{2}}{\theta^{2}+\theta+\alpha}\,(1+\theta+\alpha x)\,e^{-\theta x};\quad x>0,\ \theta>0,\ \alpha>0 \tag{1} \]

and the cumulative distribution function of the quasi Garima distribution is given by

\[ F(x;\theta,\alpha)=1-\left(1+\frac{\alpha\theta x}{\theta^{2}+\theta+\alpha}\right)e^{-\theta x};\quad x>0,\ \theta>0,\ \alpha>0 \tag{2} \]

8.2. Length Biased Quasi Garima (LBQG) Distribution

Weighted distributions provide a useful approach in distribution theory for gaining a new understanding of existing classical distributions. The concept of weighted distributions, introduced by Fisher (1934), provides a unified approach to the problems of model specification and data interpretation. Later, Rao (1965) formalized it in general terms for modelling statistical data when the usual practice of using classical distributions was found to be inappropriate. Weighted distributions are applied in various fields such as biomedicine, ecology, reliability, analysis of family data, meta-analysis, analysis of intervention data and other areas for the development of proper statistical models. Weighted distributions occur naturally when the probabilities of events as observed and recorded are specified by adjusting the probabilities of actual occurrence to take the method of ascertainment into account; failure to make such adjustments can lead to wrong conclusions.
When an investigator records observations in nature according to a certain stochastic model, the recorded observations will not follow the original distribution unless every observation is given an equal chance of being recorded. Weighted distributions also provide a technique for fitting models with an unknown weight function when samples can be taken both from the original and from the developed distributions, and they are employed to modify the probabilities of events as observed and transcribed. Weighted distributions occur in modelling clustered sampling, heterogeneity and extraneous variation in a data set.

A weighted distribution reduces to a length biased distribution when the weight function considers only the length of the units. The concept of length biased sampling was first introduced by Cox (1969) and Zelen (1974). The statistical interpretation of the length biased distribution was originally identified by Cox (1962) in the context of renewal theory. More generally, when the sampling mechanism selects units with probability proportional to a measure of the unit size, the resulting distribution is called size biased.

There are various good sources which provide a detailed description of length biased distributions. Many newly introduced distributions, along with their length biased versions, exist in the literature, and their statistical behaviour has been studied extensively over the decades. Much work on length biased distributions has been published. For example, Das and Roy (2011) proposed the length biased weighted Weibull distribution. Rather and Subramanian (2018) studied the length biased weighted generalized uniform distribution. Mudasir and Ahmad (2018) presented the length biased Nakagami distribution with properties and applications. Simmachan et al.
(2018) discussed a new lifetime distribution based on a re-parameterized model, called the two-sided length-biased inverse Gaussian distribution (TSLBIGD). Kersey and Oluyede (2012) presented the length biased inverse Weibull distribution. Rajagopalan et al. (2019) studied the length biased Aradhana distribution with applications. Subramanian and Shenbagaraja (2020) discussed the length biased quasi Sujatha distribution with properties and applications. Ayesha (2017) presented the size-biased Lindley distribution with its statistical properties and applications. Shanker and Shukla (2018) discussed a generalized size-biased Poisson-Lindley distribution and its application to modelling the size distribution of freely forming small groups. Modi and Gill (2015) proposed the length biased weighted Maxwell distribution. Rather and Subramanian (2018) studied the length biased Sushila distribution with applications. Recently, Ganaie and Rajagopalan (2020) discussed the length biased two-parameter Pranav distribution with characterizations and applications.

Suppose the non-negative random variable X has probability density function f(x), and let w(x) be a non-negative weight function. Then the probability density function of the weighted random variable X_w is given by

\[ f_w(x)=\frac{w(x)\,f(x)}{E(w(x))},\quad x>0, \]

where w(x) is the non-negative weight function and

\[ E(w(x))=\int w(x)\,f(x)\,dx<\infty . \]

Different choices of the weight function w(x) give weighted models of various forms; in particular, when w(x) = x^c the resulting distribution is called a weighted distribution.
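The weighting construction above can be illustrated numerically. The following is a minimal sketch, not part of the original study: it takes the quasi Garima density (1) as the base density, applies the weight w(x) = x^c with c = 1, and normalizes by a numerically computed E(w(x)). The helper names (`qg_pdf`, `simpson`, `expected_weight`, `weighted_pdf`), the parameter values and the truncation point of the integral are all assumptions chosen for illustration.

```python
import math


def qg_pdf(x, theta, alpha):
    """Quasi Garima density, equation (1)."""
    norm = theta**2 + theta + alpha
    return theta**2 / norm * (1 + theta + alpha * x) * math.exp(-theta * x)


def simpson(func, a, b, n=20000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def expected_weight(pdf, weight, upper=60.0):
    """Approximate E[w(X)] by integrating w(x) f(x) over (0, upper)."""
    return simpson(lambda x: weight(x) * pdf(x), 0.0, upper)


def weighted_pdf(pdf, weight, upper=60.0):
    """Return the weighted density f_w(x) = w(x) f(x) / E[w(X)]."""
    norm = expected_weight(pdf, weight, upper)
    return lambda x: weight(x) * pdf(x) / norm


theta, alpha = 1.5, 2.0                 # arbitrary illustrative values
base = lambda x: qg_pdf(x, theta, alpha)
f_w = weighted_pdf(base, lambda x: x)   # weight w(x) = x^c with c = 1
total_mass = simpson(f_w, 0.0, 60.0)    # should be close to 1
```

The upper limit 60 is only a convenient truncation for these parameter values; a smaller θ would need a larger limit because the tail decays like e^{-θx}.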
In this study we consider the LBQG distribution. To obtain the length biased version of the quasi Garima distribution we take w(x) = x, so that the probability density function of the length biased distribution is given by

\[ f_l(x)=\frac{x\,f(x)}{E(x)} \tag{3} \]

where

\[ E(x)=\int_0^{\infty} x\,f(x;\theta,\alpha)\,dx=\frac{\theta+\theta^{2}+2\alpha}{\theta(\theta^{2}+\theta+\alpha)} \tag{4} \]

Substituting equations (1) and (4) in equation (3), we get the probability density function of the LBQG distribution

\[ f_l(x)=\frac{\theta^{3}}{\theta+\theta^{2}+2\alpha}\,x\,(1+\theta+\alpha x)\,e^{-\theta x} \tag{5} \]

and the cumulative distribution function of the LBQG distribution can be obtained as

\[ F_l(x)=\int_0^{x} f_l(x)\,dx=\frac{\theta^{3}}{\theta+\theta^{2}+2\alpha}\int_0^{x}(1+\theta+\alpha x)\,x\,e^{-\theta x}\,dx \]

\[ F_l(x)=\frac{\theta^{3}}{\theta+\theta^{2}+2\alpha}\left(\int_0^{x} x\,e^{-\theta x}\,dx+\theta\int_0^{x} x\,e^{-\theta x}\,dx+\alpha\int_0^{x} x^{2}e^{-\theta x}\,dx\right) \tag{6} \]

Put θx = t, so that θ dx = dt, dx = dt/θ and x = t/θ. When x → 0, t → 0, and when x → x, t → θx.

After simplification of equation (6), we obtain the cumulative distribution function of the LBQG distribution

\[ F_l(x)=\frac{1}{\theta+\theta^{2}+2\alpha}\left(\theta\,\gamma(2,\theta x)+\theta^{2}\gamma(2,\theta x)+\alpha\,\gamma(3,\theta x)\right) \tag{7} \]

where \(\gamma(a,z)=\int_0^{z}t^{a-1}e^{-t}\,dt\) denotes the lower incomplete gamma function.

Figure 8.1: pdf plot of LBQG distribution

Figure 8.2: cdf plot of LBQG distribution

8.3. Survival Analysis

In this section, we derive the survival function, hazard rate and reverse hazard rate functions of the LBQG distribution.

a). Survival function

The survival function is defined as the probability that a system survives beyond a specified time; it is the complement of the cumulative distribution function.
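As a quick check on equations (5) and (7), the closed-form distribution function can be compared with a direct numerical integral of the density. The block below is an illustrative sketch under stated assumptions: the function names are hypothetical, the parameter values are arbitrary, and the incomplete gamma helper is written only for the integer shapes 2 and 3 that appear in (7).

```python
import math


def lower_gamma(a, z):
    """Lower incomplete gamma function gamma(a, z) for a positive integer a."""
    partial = sum(z**k / math.factorial(k) for k in range(a))
    return math.factorial(a - 1) * (1.0 - math.exp(-z) * partial)


def lbqg_pdf(x, theta, alpha):
    """LBQG density, equation (5)."""
    k = theta + theta**2 + 2 * alpha
    return theta**3 / k * x * (1 + theta + alpha * x) * math.exp(-theta * x)


def lbqg_cdf(x, theta, alpha):
    """LBQG distribution function, equation (7)."""
    k = theta + theta**2 + 2 * alpha
    g2 = lower_gamma(2, theta * x)
    g3 = lower_gamma(3, theta * x)
    return (theta * g2 + theta**2 * g2 + alpha * g3) / k


def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


theta, alpha = 1.5, 2.0   # arbitrary illustrative values
numeric = simpson(lambda t: lbqg_pdf(t, theta, alpha), 0.0, 2.0)
closed = lbqg_cdf(2.0, theta, alpha)   # should agree with `numeric`
```

The same two functions give the survival function as 1 - lbqg_cdf(x, theta, alpha) and the hazard rate as the ratio of lbqg_pdf to that survival value.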
The survival function or reliability function of the LBQG distribution can be obtained as

\[ S(x)=1-F_l(x) \]

\[ S(x)=1-\frac{1}{\theta+\theta^{2}+2\alpha}\left(\theta\,\gamma(2,\theta x)+\theta^{2}\gamma(2,\theta x)+\alpha\,\gamma(3,\theta x)\right) \]

b). Hazard function

The hazard function is also known as the instantaneous failure rate or force of mortality and is given by

\[ h(x)=\frac{f_l(x)}{1-F_l(x)} \]

\[ h(x)=\frac{x\,\theta^{3}(1+\theta+\alpha x)\,e^{-\theta x}}{(\theta+\theta^{2}+2\alpha)-\left(\theta\,\gamma(2,\theta x)+\theta^{2}\gamma(2,\theta x)+\alpha\,\gamma(3,\theta x)\right)} \]

c). Reverse hazard function

The reverse hazard function of the LBQG distribution is given by

\[ h_r(x)=\frac{f_l(x)}{F_l(x)} \]

\[ h_r(x)=\frac{x\,\theta^{3}(1+\theta+\alpha x)\,e^{-\theta x}}{\theta\,\gamma(2,\theta x)+\theta^{2}\gamma(2,\theta x)+\alpha\,\gamma(3,\theta x)} \]

Figure 8.3: survival plot of LBQG distribution

Figure 8.4: hazard plot of LBQG distribution

8.4. Structural Measures

In this section, various statistical properties of the LBQG distribution are investigated, including its moments, harmonic mean, moment generating function and characteristic function.

8.4.1 Moments

Suppose the random variable X follows the LBQG distribution with parameters θ and α. Then the rth order moment E(X^r) of the LBQG distribution can be obtained as

\[ E(X^{r})=\mu_r'=\int_0^{\infty}x^{r}f_l(x)\,dx \]

\[ E(X^{r})=\mu_r'=\int_0^{\infty}x^{r}\,\frac{\theta^{3}}{\theta+\theta^{2}+2\alpha}\,x\,(1+\theta+\alpha x)\,e^{-\theta x}\,dx \]

\[ E(X^{r})=\mu_r'=\frac{\theta^{3}}{\theta+\theta^{2}+2\alpha}\int_0^{\infty}x^{r+1}(1+\theta+\alpha x)\,e^{-\theta x}\,dx \]

\[ E(X^{r})=\mu_r'=\frac{\theta^{3}}{\theta+\theta^{2}+2\alpha}\left(\int_0^{\infty}x^{(r+2)-1}e^{-\theta x}\,dx+\theta\int_0^{\infty}x^{(r+2)-1}e^{-\theta x}\,dx+\alpha\int_0^{\infty}x^{(r+3)-1}e^{-\theta x}\,dx\right) \tag{8} \]

After simplification of equation (8), we obtain

\[ E(X^{r})=\mu_r'=\frac{\theta\,\Gamma(r+2)+\theta^{2}\Gamma(r+2)+\alpha\,\Gamma(r+3)}{\theta^{r}(\theta+\theta^{2}+2\alpha)} \tag{9} \]

By putting r = 1, 2, 3 and 4 in equation (9), we get the first four moments of the LBQG distribution.
\[ E(X)=\mu_1'=\frac{2\theta+2\theta^{2}+6\alpha}{\theta(\theta+\theta^{2}+2\alpha)} \]

\[ E(X^{2})=\mu_2'=\frac{6\theta+6\theta^{2}+24\alpha}{\theta^{2}(\theta+\theta^{2}+2\alpha)} \]

\[ E(X^{3})=\mu_3'=\frac{24\theta+24\theta^{2}+120\alpha}{\theta^{3}(\theta+\theta^{2}+2\alpha)} \]

\[ E(X^{4})=\mu_4'=\frac{120\theta+120\theta^{2}+720\alpha}{\theta^{4}(\theta+\theta^{2}+2\alpha)} \]

\[ \text{Variance}=\frac{6\theta+6\theta^{2}+24\alpha}{\theta^{2}(\theta+\theta^{2}+2\alpha)}-\left(\frac{2\theta+2\theta^{2}+6\alpha}{\theta(\theta+\theta^{2}+2\alpha)}\right)^{2} \]

\[ \text{S.D.}(\sigma)=\sqrt{\frac{6\theta+6\theta^{2}+24\alpha}{\theta^{2}(\theta+\theta^{2}+2\alpha)}-\left(\frac{2\theta+2\theta^{2}+6\alpha}{\theta(\theta+\theta^{2}+2\alpha)}\right)^{2}} \]

8.4.2 Harmonic mean

The harmonic mean of the proposed LBQG distribution is obtained from the expectation of 1/X as

\[ H.M=E\left(\frac{1}{x}\right)=\int_0^{\infty}\frac{1}{x}\,f_l(x)\,dx \]

\[ H.M=\frac{\theta^{3}}{\theta+\theta^{2}+2\alpha}\int_0^{\infty}(1+\theta+\alpha x)\,e^{-\theta x}\,dx \]

\[ H.M=\frac{\theta^{3}}{\theta+\theta^{2}+2\alpha}\left(\int_0^{\infty}e^{-\theta x}\,dx+\theta\int_0^{\infty}e^{-\theta x}\,dx+\alpha\int_0^{\infty}x\,e^{-\theta x}\,dx\right) \]

After simplification of the above equation, we obtain

\[ H.M=\frac{\theta^{3}}{\theta+\theta^{2}+2\alpha}\left(\frac{1}{\theta}+1+\frac{\alpha}{\theta^{2}}\right) \]

\[ H.M=\frac{\theta(\theta^{2}+\theta+\alpha)}{\theta+\theta^{2}+2\alpha} \]

Strictly, this expression is E(1/X); the harmonic mean in the usual sense is its reciprocal.

8.4.3 Moment Generating Function and Characteristic Function

The moment generating function (MGF) is an alternative specification of a distribution in probability theory and statistics, and it can be used to find the moments of a distribution. Let X be a random variable following the LBQG distribution. Then the MGF of X can be obtained as

\[ M_X(t)=E(e^{tx})=\int_0^{\infty}e^{tx}f_l(x)\,dx \]

Using the Taylor series, we get

\[ M_X(t)=\int_0^{\infty}\left(1+tx+\frac{(tx)^{2}}{2!}+\cdots\right)f_l(x)\,dx \]

\[ M_X(t)=\int_0^{\infty}\sum_{j=0}^{\infty}\frac{t^{j}}{j!}\,x^{j}f_l(x)\,dx \]

\[ M_X(t)=\sum_{j=0}^{\infty}\frac{t^{j}}{j!}\,\mu_j' \]

\[ M_X(t)=\sum_{j=0}^{\infty}\frac{t^{j}}{j!}\left(\frac{\theta\,\Gamma(j+2)+\theta^{2}\Gamma(j+2)+\alpha\,\Gamma(j+3)}{\theta^{j}(\theta+\theta^{2}+2\alpha)}\right) \]

\[ M_X(t)=\frac{1}{\theta+\theta^{2}+2\alpha}\sum_{j=0}^{\infty}\frac{t^{j}}{j!\,\theta^{j}}\left(\theta\,\Gamma(j+2)+\theta^{2}\Gamma(j+2)+\alpha\,\Gamma(j+3)\right) \]

Similarly, the characteristic function of the LBQG distribution can be obtained as

\[ \phi_X(t)=M_X(it) \]

\[ M_X(it)=\frac{1}{\theta+\theta^{2}+2\alpha}\sum_{j=0}^{\infty}\frac{(it)^{j}}{j!\,\theta^{j}}\left(\theta\,\Gamma(j+2)+\theta^{2}\Gamma(j+2)+\alpha\,\Gamma(j+3)\right) \]

8.5. Order Statistics

Order statistics are a useful concept in the statistical sciences and have a wide range of applications in modelling auctions, car races and insurance policies. Suppose X(1), X(2), ..., X(n) denote the order statistics of a random sample X1, X2, ..., Xn drawn from a continuous population with probability density function f_X(x) and cumulative distribution function F_X(x). Then the probability density function of the rth order statistic X(r) is given by

\[ f_{X(r)}(x)=\frac{n!}{(r-1)!\,(n-r)!}\,f_X(x)\left(F_X(x)\right)^{r-1}\left(1-F_X(x)\right)^{n-r} \tag{10} \]

By substituting equations (5) and (7) in equation (10), we get the probability density function of the rth order statistic of the LBQG distribution:

\[ f_{X(r)}(x)=\frac{n!}{(r-1)!\,(n-r)!}\left(\frac{\theta^{3}}{\theta+\theta^{2}+2\alpha}\,x\,(1+\theta+\alpha x)\,e^{-\theta x}\right) \]
\[ \times\left(\frac{1}{\theta+\theta^{2}+2\alpha}\left(\theta\,\gamma(2,\theta x)+\theta^{2}\gamma(2,\theta x)+\alpha\,\gamma(3,\theta x)\right)\right)^{r-1} \]
\[ \times\left(1-\frac{1}{\theta+\theta^{2}+2\alpha}\left(\theta\,\gamma(2,\theta x)+\theta^{2}\gamma(2,\theta x)+\alpha\,\gamma(3,\theta x)\right)\right)^{n-r} \]

Therefore, the probability density function of the highest order statistic X(n) of the LBQG distribution can be obtained as

\[ f_{X(n)}(x)=\frac{n\,\theta^{3}}{\theta+\theta^{2}+2\alpha}\,x\,(1+\theta+\alpha x)\,e^{-\theta x} \]
\[ \times\left(\frac{1}{\theta+\theta^{2}+2\alpha}\left(\theta\,\gamma(2,\theta x)+\theta^{2}\gamma(2,\theta x)+\alpha\,\gamma(3,\theta x)\right)\right)^{n-1} \]

and the probability density function of the first order statistic X(1) of the LBQG distribution can be obtained as

\[ f_{X(1)}(x)=\frac{n\,\theta^{3}}{\theta+\theta^{2}+2\alpha}\,x\,(1+\theta+\alpha x)\,e^{-\theta x} \]
\[ \times\left(1-\frac{1}{\theta+\theta^{2}+2\alpha}\left(\theta\,\gamma(2,\theta x)+\theta^{2}\gamma(2,\theta x)+\alpha\,\gamma(3,\theta x)\right)\right)^{n-1} \]

8.6. Bonferroni and Lorenz Curves

The Bonferroni and Lorenz curves are also known as income distribution curves and are used in economics to study inequality in the distribution of income or poverty. Nowadays they are also used in various other fields such as reliability, medicine, insurance and demography.
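Returning briefly to the order statistics of Section 8.5, the density in equation (10) can be evaluated directly once the LBQG density and distribution function are coded. The sketch below is illustrative only: the function names, the parameter values and the truncation point used for the numerical integral are assumptions, and the incomplete gamma helper handles only the integer shapes that occur in (7).

```python
import math


def lower_gamma(a, z):
    """Lower incomplete gamma function gamma(a, z) for a positive integer a."""
    partial = sum(z**k / math.factorial(k) for k in range(a))
    return math.factorial(a - 1) * (1.0 - math.exp(-z) * partial)


def lbqg_pdf(x, theta, alpha):
    """LBQG density, equation (5)."""
    k = theta + theta**2 + 2 * alpha
    return theta**3 / k * x * (1 + theta + alpha * x) * math.exp(-theta * x)


def lbqg_cdf(x, theta, alpha):
    """LBQG distribution function, equation (7)."""
    k = theta + theta**2 + 2 * alpha
    g2 = lower_gamma(2, theta * x)
    g3 = lower_gamma(3, theta * x)
    return (theta * g2 + theta**2 * g2 + alpha * g3) / k


def order_stat_pdf(x, r, n, theta, alpha):
    """Density of the r-th order statistic out of n, equation (10)."""
    const = math.factorial(n) / (math.factorial(r - 1) * math.factorial(n - r))
    big_f = lbqg_cdf(x, theta, alpha)
    return (const * lbqg_pdf(x, theta, alpha)
            * big_f ** (r - 1) * (1.0 - big_f) ** (n - r))


def simpson(func, a, b, n=20000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


theta, alpha = 1.5, 2.0   # arbitrary illustrative values
# The density of the 2nd order statistic in a sample of 5 should integrate to 1.
mass = simpson(lambda t: order_stat_pdf(t, 2, 5, theta, alpha), 0.0, 40.0)
```

Setting r = n or r = 1 in `order_stat_pdf` reproduces the densities of the largest and smallest order statistics given above.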
The Bonferroni and Lorenz curves are defined as

\[ B(p)=\frac{1}{p\,\mu_1'}\int_0^{q}x\,f_l(x)\,dx \]

\[ L(p)=p\,B(p)=\frac{1}{\mu_1'}\int_0^{q}x\,f_l(x)\,dx \]

where

\[ \mu_1'=\frac{2\theta+2\theta^{2}+6\alpha}{\theta(\theta+\theta^{2}+2\alpha)}\quad\text{and}\quad q=F^{-1}(p). \]

\[ B(p)=\frac{\theta(\theta+\theta^{2}+2\alpha)}{p\,(2\theta+2\theta^{2}+6\alpha)}\int_0^{q}\frac{\theta^{3}}{\theta+\theta^{2}+2\alpha}\,x^{2}(1+\theta+\alpha x)\,e^{-\theta x}\,dx \]

\[ B(p)=\frac{\theta^{4}}{p\,(2\theta+2\theta^{2}+6\alpha)}\int_0^{q}x^{2}(1+\theta+\alpha x)\,e^{-\theta x}\,dx \]

\[ B(p)=\frac{\theta^{4}}{p\,(2\theta+2\theta^{2}+6\alpha)}\left(\int_0^{q}x^{3-1}e^{-\theta x}\,dx+\theta\int_0^{q}x^{3-1}e^{-\theta x}\,dx+\alpha\int_0^{q}x^{4-1}e^{-\theta x}\,dx\right) \]

Using \(\int_0^{q}x^{a-1}e^{-\theta x}\,dx=\gamma(a,\theta q)/\theta^{a}\) and simplifying, we get

\[ B(p)=\frac{\theta\,\gamma(3,\theta q)+\theta^{2}\gamma(3,\theta q)+\alpha\,\gamma(4,\theta q)}{p\,(2\theta+2\theta^{2}+6\alpha)} \]

\[ L(p)=\frac{\theta\,\gamma(3,\theta q)+\theta^{2}\gamma(3,\theta q)+\alpha\,\gamma(4,\theta q)}{2\theta+2\theta^{2}+6\alpha} \]

8.7. Maximum Likelihood Estimation and Fisher's Information Matrix

In this section, we discuss the estimation of the parameters of the LBQG distribution by the method of maximum likelihood and derive its Fisher's information matrix. Suppose X1, X2, ..., Xn is a random sample of size n drawn from the LBQG distribution. Then the likelihood function can be written as

\[ L(x)=\prod_{i=1}^{n}f_l(x_i) \]

\[ L(x)=\prod_{i=1}^{n}\left(\frac{\theta^{3}}{\theta+\theta^{2}+2\alpha}\,x_i\,(1+\theta+\alpha x_i)\,e^{-\theta x_i}\right) \]

\[ L(x)=\frac{\theta^{3n}}{(\theta+\theta^{2}+2\alpha)^{n}}\prod_{i=1}^{n}\left(x_i\,(1+\theta+\alpha x_i)\,e^{-\theta x_i}\right) \]

The log-likelihood function is given by

\[ \log L=3n\log\theta-n\log(\theta+\theta^{2}+2\alpha)+\sum_{i=1}^{n}\log x_i+\sum_{i=1}^{n}\log(1+\theta+\alpha x_i)-\theta\sum_{i=1}^{n}x_i \tag{11} \]

Now differentiate equation (11) with respect to the parameters θ and α.
We obtain the normal equations as

\[ \frac{\partial\log L}{\partial\theta}=\frac{3n}{\theta}-n\left(\frac{1+2\theta}{\theta+\theta^{2}+2\alpha}\right)+\sum_{i=1}^{n}\left(\frac{1}{1+\theta+\alpha x_i}\right)-\sum_{i=1}^{n}x_i=0 \]

\[ \frac{\partial\log L}{\partial\alpha}=-n\left(\frac{2}{\theta+\theta^{2}+2\alpha}\right)+\sum_{i=1}^{n}\left(\frac{x_i}{1+\theta+\alpha x_i}\right)=0 \]

This system of nonlinear equations is too complicated to solve algebraically, so we use a numerical technique such as the Newton-Raphson method to estimate the parameters of the proposed distribution.

To obtain confidence intervals we use the asymptotic normality of the maximum likelihood estimator. Let \(\hat{\lambda}=(\hat{\theta},\hat{\alpha})\) denote the MLE of \(\lambda=(\theta,\alpha)\). The result can be stated as follows:

\[ \sqrt{n}\,(\hat{\lambda}-\lambda)\rightarrow N_2\left(0,\,I^{-1}(\lambda)\right) \]

where \(I(\lambda)\) is Fisher's information matrix, i.e.,

\[ I(\lambda)=-\frac{1}{n}\begin{pmatrix} E\left(\dfrac{\partial^{2}\log L}{\partial\theta^{2}}\right) & E\left(\dfrac{\partial^{2}\log L}{\partial\theta\,\partial\alpha}\right)\\[2ex] E\left(\dfrac{\partial^{2}\log L}{\partial\alpha\,\partial\theta}\right) & E\left(\dfrac{\partial^{2}\log L}{\partial\alpha^{2}}\right)\end{pmatrix} \]

where

\[ E\left(\frac{\partial^{2}\log L}{\partial\theta^{2}}\right)=-\frac{3n}{\theta^{2}}-n\left(\frac{2(\theta+\theta^{2}+2\alpha)-(1+2\theta)^{2}}{(\theta+\theta^{2}+2\alpha)^{2}}\right)-\sum_{i=1}^{n}E\left(\frac{1}{(1+\theta+\alpha x_i)^{2}}\right) \]

\[ E\left(\frac{\partial^{2}\log L}{\partial\alpha^{2}}\right)=n\left(\frac{4}{(\theta+\theta^{2}+2\alpha)^{2}}\right)-\sum_{i=1}^{n}E\left(\frac{x_i^{2}}{(1+\theta+\alpha x_i)^{2}}\right) \]

\[ E\left(\frac{\partial^{2}\log L}{\partial\theta\,\partial\alpha}\right)=n\left(\frac{2(1+2\theta)}{(\theta+\theta^{2}+2\alpha)^{2}}\right)-\sum_{i=1}^{n}E\left(\frac{x_i}{(1+\theta+\alpha x_i)^{2}}\right) \]

Since λ is unknown, we estimate \(I^{-1}(\lambda)\) by \(I^{-1}(\hat{\lambda})\), and this can be used to obtain asymptotic confidence intervals for θ and α.

8.8. Data Analysis

In this section, the goodness of fit of the LBQG distribution is examined by analysing real data sets, to show that the LBQG distribution fits better than the quasi Garima, Garima, exponential and Lindley distributions. For the real data analysis, 603 subjects were randomly selected from various hospitals in two districts of Kerala, Palakkad and Malappuram. R software is employed to estimate the unknown parameters along with the model comparison criterion values.
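The estimation step itself was carried out in R; as an illustration only, the ingredients of the maximum likelihood fit can also be sketched in Python. The block below codes the log-likelihood (11) and its two first-order partial derivatives, then checks the derivatives against central finite differences. The function names are hypothetical, and the five observations are made-up numbers used purely to exercise the code; they are not the hospital data.

```python
import math


def lbqg_loglik(theta, alpha, xs):
    """Log-likelihood of the LBQG distribution, equation (11)."""
    n = len(xs)
    k = theta + theta**2 + 2 * alpha
    return (3 * n * math.log(theta) - n * math.log(k)
            + sum(math.log(x) for x in xs)
            + sum(math.log(1 + theta + alpha * x) for x in xs)
            - theta * sum(xs))


def lbqg_score(theta, alpha, xs):
    """Partial derivatives of the log-likelihood with respect to theta and alpha."""
    n = len(xs)
    k = theta + theta**2 + 2 * alpha
    d_theta = (3 * n / theta - n * (1 + 2 * theta) / k
               + sum(1 / (1 + theta + alpha * x) for x in xs) - sum(xs))
    d_alpha = -2 * n / k + sum(x / (1 + theta + alpha * x) for x in xs)
    return d_theta, d_alpha


def numerical_score(theta, alpha, xs, h=1e-6):
    """Central finite-difference approximation of the same derivatives."""
    d_theta = (lbqg_loglik(theta + h, alpha, xs)
               - lbqg_loglik(theta - h, alpha, xs)) / (2 * h)
    d_alpha = (lbqg_loglik(theta, alpha + h, xs)
               - lbqg_loglik(theta, alpha - h, xs)) / (2 * h)
    return d_theta, d_alpha


data = [0.5, 1.2, 2.3, 0.8, 3.1]            # made-up observations
analytic = lbqg_score(1.5, 2.0, data)
approx = numerical_score(1.5, 2.0, data)    # should match `analytic` closely
```

Setting both components of `lbqg_score` to zero gives the normal equations, so these functions could be handed to a Newton-Raphson or quasi-Newton routine; that step is not shown here.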
In order to compare the LBQG distribution with the quasi Garima, Garima, exponential and Lindley distributions, we apply the AIC (Akaike Information Criterion), AICC (Akaike Information Criterion Corrected), BIC (Bayesian Information Criterion) and -2logL. The better distribution is the one with the lower values of AIC, BIC, AICC and -2logL. The criteria are evaluated using the following formulas:

\[ AIC=2k-2\log L,\qquad BIC=k\log n-2\log L\qquad\text{and}\qquad AICC=AIC+\frac{2k(k+1)}{n-k-1} \]

where k is the number of parameters in the statistical model, n is the sample size and -2logL is minus twice the maximized value of the log-likelihood function under the considered model.

Table 8.1: Comparison of fitted distributions

Distribution    MLE                 S.E.                -2logL     AIC        BIC        AICC
LBQG            α̂ = 6.260663       α̂ = 1.677726       195.9141   199.9141   204.2004   200.1141
                θ̂ = 9.805593       θ̂ = 7.132346
Quasi Garima    α̂ = 3355.054889    α̂ = 0.0102445      220.7157   224.7157   229.0019   224.9157
                θ̂ = 0.653618       θ̂ = 0.000000
Garima          θ̂ = 0.47847006     θ̂ = 0.05103353     256.3205   258.3205   260.4636   258.3860
Exponential     θ̂ = 0.32687301     θ̂ = 0.04118174     266.8915   268.8915   271.0347   268.9570
Lindley         θ̂ = 0.53923226     θ̂ = 0.04958387     242.7153   244.7153   246.8584   244.7808

From the results given in Table 8.1, it is clearly observed that the LBQG distribution has smaller AIC, BIC, AICC and -2logL values than the quasi Garima, Garima, exponential and Lindley distributions. Hence it can be concluded that the LBQG distribution gives a better fit than the quasi Garima, Garima, exponential and Lindley distributions.

8.9. Conclusion

This study describes a new model based on the two-parameter Quasi Garima distribution, named the Length Biased Quasi Garima (LBQG) distribution. The distribution is generated by using the length biased technique.
Its various statistical properties, including its moments, harmonic mean, moment generating function, characteristic function, order statistics, and Bonferroni and Lorenz curves, have been investigated. Its parameters have also been estimated by the method of maximum likelihood. Lastly, the LBQG distribution has been applied to real data sets to assess its goodness of fit, and its fit has been found to be good in comparison with the Quasi Garima, Garima, Exponential and Lindley distributions.