A simple and efficient test for the Pareto law
Supplementary material

Francisco J. Goerlich Gisbert
University of Valencia and Instituto Valenciano de Investigaciones Económicas (Ivie)

Address: University of Valencia, Department of Economic Analysis, Campus Tarongers, Av. Tarongers s/n, 46022-Valencia, Spain. Phone: 00-34-96 382 82 53. Fax: 00-34-96 382 82 49. E-mail: Francisco.J.Goerlich@uv.es.

SUPPLEMENTARY MATERIAL

Useful results related to the Pareto law and the MLE estimation

We state some results related to the Pareto law used in the text. All of them follow from standard results on the distribution of functions of random variables (Mood, Graybill and Boes 1974, Chapter V).

Result 1: If $x$ follows a Pareto law, then $y_1 = \log(x/\mu)$ follows an exponential distribution with parameter $\alpha$. Thus, $E(y_1) = 1/\alpha$.

Result 2: If $x$ follows a Pareto law, then $y_2 = \mu/x$ follows a power function distribution with parameter $\alpha$ (a special type of Pearson Type I distribution). Thus, $E(y_2) = \dfrac{\alpha}{\alpha+1}$ and $E(y_2^2) = \dfrac{\alpha}{\alpha+2}$.

Result 3: If $x$ follows a Pareto law with $\alpha = 1$, then $y_2 = \mu/x$ follows a uniform distribution on the interval [0,1].

Remark A.1. The MLE is very different from the OLS estimator of the log-log linear relationship mentioned in the introduction (Nishiyama and Osada 2004; Nishiyama, Osada and Sato 2008), although Aban and Meerschaert (2004) show that the MLE coincides with the Generalized Least Squares (GLS) estimator of this log-log linear relationship once the mean and covariance structure of the extreme order statistics is taken into account.

Remark A.2. Using well-known results and the fact that the exponential distribution belongs to the class of gamma distributions, it is possible in this case to obtain the small sample distribution of $\hat{\alpha}$. Therefore, we know the exact expectation and variance of the estimator (Johnson and Kotz 1970, 19.4.4).

Remark A.3.
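We noted in the text that $\alpha$ can be regarded as a measure of inequality: higher values of $\alpha$ imply less dispersion and, in fact, lower values of standard relative inequality measures such as CV(X) or G(X).

Derivation of the $LM_P$ test statistic

In the sequel, we obtain the score and the information matrix of $\log L_B$, and then derive the LM statistic for $H_0: \lambda = \mu$. The gradient of (7) is given by

$$\frac{\partial \log L_B}{\partial \alpha} = \frac{n}{\alpha} - \sum_{i=1}^{n} \log\left(1 + \frac{x_i - \mu}{\lambda}\right)$$

$$\frac{\partial \log L_B}{\partial \lambda} = -\frac{n}{\lambda} + (\alpha+1) \sum_{i=1}^{n} \frac{x_i - \mu}{\lambda^2}\left(1 + \frac{x_i - \mu}{\lambda}\right)^{-1} \tag{A.1}$$

Note that equating these expressions to zero yields a system of nonlinear equations that has to be solved by numerical methods. Therefore, the MLE of the parameters of the Burr density does not have an explicit formula.

The Hessian is given by

$$\frac{\partial^2 \log L_B}{\partial \alpha^2} = -\frac{n}{\alpha^2}$$

$$\frac{\partial^2 \log L_B}{\partial \alpha\, \partial \lambda} = \sum_{i=1}^{n} \frac{x_i - \mu}{\lambda^2}\left(1 + \frac{x_i - \mu}{\lambda}\right)^{-1}$$

$$\frac{\partial^2 \log L_B}{\partial \lambda^2} = \frac{n}{\lambda^2} - (\alpha+1) \sum_{i=1}^{n} \frac{(x_i - \mu)(2\lambda + x_i - \mu)}{\lambda^2 (\lambda + x_i - \mu)^2} \tag{A.2}$$

Evaluating (A.1) at the restricted estimate of $\lambda$, $\hat{\lambda} = \mu$, we have

$$\left.\frac{\partial \log L_B}{\partial \alpha}\right|_{\lambda=\mu} = n\left[\frac{1}{\alpha} - \frac{1}{n}\sum_{i=1}^{n} \log\frac{x_i}{\mu}\right] \tag{A.3a}$$

and

$$\left.\frac{\partial \log L_B}{\partial \lambda}\right|_{\lambda=\mu} = \frac{n(\alpha+1)}{\mu}\left[\frac{\alpha}{\alpha+1} - \frac{1}{n}\sum_{i=1}^{n} \frac{\mu}{x_i}\right] \tag{A.3b}$$

Remark A.4. Given that the restricted estimate of $\alpha$ is just the Pareto MLE, $\hat{\alpha}$ (5), it is worth noting that (A.3a) evaluated at $\hat{\alpha}$ is exactly zero, $\left.\dfrac{\partial \log L_B}{\partial \alpha}\right|_{\hat{\alpha},\, \lambda=\mu} = 0$.

Remark A.5. Defining $z_1 = \dfrac{1}{\alpha} - \dfrac{1}{n}\sum_{i=1}^{n} \log\dfrac{x_i}{\mu}$, a function of $\alpha$, (A.3a) measures a discrepancy between a population mean and a sample mean, since by Result 1 $E(z_1) = 0$. This gives the MLE of $\alpha$ under the Pareto law a nice moment interpretation.

Remark A.6. Defining $z_2 = \dfrac{\alpha}{\alpha+1} - \dfrac{1}{n}\sum_{i=1}^{n} \dfrac{\mu}{x_i}$, a function of $\alpha$, (A.3b) measures a discrepancy between a population mean and a sample mean, since by Result 2 $E(z_2) = 0$.

Evaluating (A.2) at the restricted estimate of $\lambda$, $\hat{\lambda} = \mu$, we have

$$\left.\frac{\partial^2 \log L_B}{\partial \alpha\, \partial \lambda}\right|_{\lambda=\mu} = \frac{n}{\mu}\left[1 - \frac{1}{n}\sum_{i=1}^{n} \frac{\mu}{x_i}\right] \tag{A.4a}$$

and

$$\left.\frac{\partial^2 \log L_B}{\partial \lambda^2}\right|_{\lambda=\mu} = \frac{n}{\mu^2} - \frac{n(\alpha+1)}{\mu^2}\left[1 - \frac{1}{n}\sum_{i=1}^{n} \frac{\mu^2}{x_i^2}\right] \tag{A.4b}$$

Changing the sign and taking expectations, using Result 2,

$$-E\left[\frac{\partial^2 \log L_B}{\partial \alpha\, \partial \lambda}\right] = -\frac{n}{\mu}\left[1 - \frac{1}{n}\sum_{i=1}^{n} E\left(\frac{\mu}{x_i}\right)\right] = -\frac{n}{\mu(\alpha+1)} \tag{A.5a}$$

$$-E\left[\frac{\partial^2 \log L_B}{\partial \lambda^2}\right] = -\frac{n}{\mu^2} + \frac{n(\alpha+1)}{\mu^2}\left[1 - \frac{1}{n}\sum_{i=1}^{n} E\left(\frac{\mu^2}{x_i^2}\right)\right] = \frac{n\alpha}{\mu^2(\alpha+2)} \tag{A.5b}$$

Thus, the Fisher sample information, evaluated at $\lambda = \mu$, is given by

$$I(\alpha, \mu) = \frac{n}{\alpha^2}\begin{bmatrix} 1 & -\dfrac{\alpha^2}{\mu(\alpha+1)} \\[2ex] -\dfrac{\alpha^2}{\mu(\alpha+1)} & \dfrac{\alpha^3}{\mu^2(\alpha+2)} \end{bmatrix} \tag{A.6}$$

and its inverse is

$$I(\alpha, \mu)^{-1} = \frac{\mu^2(\alpha+2)(\alpha+1)^2}{n\alpha}\begin{bmatrix} \dfrac{\alpha^3}{\mu^2(\alpha+2)} & \dfrac{\alpha^2}{\mu(\alpha+1)} \\[2ex] \dfrac{\alpha^2}{\mu(\alpha+1)} & 1 \end{bmatrix} \tag{A.7}$$

Now we can form the LM statistic. Noting that $\hat{z}_1 = \dfrac{1}{\hat{\alpha}} - \dfrac{1}{n}\sum_{i=1}^{n} \log\dfrac{x_i}{\mu} = 0$, the score vector evaluated at the restricted estimate is

$$\hat{q} = \begin{bmatrix} n\,\hat{z}_1 \\[1ex] \dfrac{n(\hat{\alpha}+1)}{\mu}\,\hat{z}_2 \end{bmatrix} = \begin{bmatrix} 0 \\[1ex] \dfrac{n(\hat{\alpha}+1)}{\mu}\,\hat{z}_2 \end{bmatrix} \tag{A.8}$$

with $\hat{z}_2 = \dfrac{\hat{\alpha}}{\hat{\alpha}+1} - \dfrac{1}{n}\sum_{i=1}^{n} \dfrac{\mu}{x_i}$. Therefore, only the (2,2) element of (A.7) matters in the LM statistic, $\dfrac{1}{n}\cdot\dfrac{\mu^2(\alpha+2)(\alpha+1)^2}{\alpha}$. The feasible LM statistic is then given by

$$LM_P = n\,\frac{(\hat{\alpha}+2)(\hat{\alpha}+1)^4}{\hat{\alpha}}\,\hat{z}_2^2 \;\overset{asy}{\sim}\; \chi^2(1) \quad \text{under } H_0: \lambda = \mu \tag{A.9}$$

as stated in the text.

Power tables behind Figures 2, 3 and 4

Table A.1 Power of the $LM_P$ test against the Burr distribution. Percentage of rejections, n = 1000 and asymptotic critical values. $H_1: \lambda = \mu(1 + \delta)$

| Significance level | $\delta$ | $\alpha$ = 1 | $\alpha$ = 2 | $\alpha$ = 3 | $\alpha$ = 5 | $\alpha$ = 10 |
|---|---|---|---|---|---|---|
| 10% | -0.5 | 100.0% | 99.8% | 97.5% | 80.9% | 42.9% |
| 10% | -0.4 | 99.8% | 96.7% | 87.4% | 63.7% | 32.2% |
| 10% | -0.3 | 93.7% | 80.9% | 65.7% | 43.3% | 23.0% |
| 10% | -0.2 | 64.3% | 49.5% | 38.0% | 25.6% | 15.7% |
| 10% | -0.1 | 25.0% | 20.5% | 17.2% | 14.2% | 11.4% |
| 10% | 0.0 | 10.0% | 10.0% | 9.9% | 10.0% | 9.7% |
| 10% | 0.1 | 22.5% | 18.2% | 15.3% | 12.5% | 10.3% |
| 10% | 0.2 | 51.3% | 39.5% | 30.2% | 20.7% | 12.9% |
| 10% | 0.3 | 77.9% | 63.8% | 50.4% | 32.7% | 17.5% |
| 10% | 0.4 | 92.8% | 82.9% | 69.6% | 47.4% | 23.5% |
| 10% | 0.5 | 98.4% | 93.6% | 84.1% | 62.0% | 31.1% |
| 5% | -0.5 | 100.0% | 99.6% | 95.4% | 73.4% | 33.2% |
| 5% | -0.4 | 99.4% | 94.2% | 81.1% | 53.4% | 23.2% |
| 5% | -0.3 | 89.2% | 72.5% | 54.9% | 33.4% | 15.5% |
| 5% | -0.2 | 52.9% | 38.1% | 27.9% | 17.4% | 9.6% |
| 5% | -0.1 | 16.1% | 12.8% | 10.6% | 8.3% | 6.3% |
| 5% | 0.0 | 5.0% | 5.1% | 4.9% | 5.0% | 4.8% |
| 5% | 0.1 | 13.8% | 10.5% | 8.3% | 6.3% | 5.0% |
| 5% | 0.2 | 38.2% | 27.0% | 19.2% | 11.8% | 6.5% |
| 5% | 0.3 | 67.0% | 50.4% | 36.6% | 20.9% | 9.5% |
| 5% | 0.4 | 87.1% | 72.9% | 56.3% | 33.8% | 13.6% |
| 5% | 0.5 | 96.4% | 88.0% | 74.1% | 47.9% | 19.3% |
| 1% | -0.5 | 100.0% | 98.2% | 88.2% | 55.7% | 18.1% |
| 1% | -0.4 | 97.6% | 85.2% | 64.6% | 34.0% | 11.0% |
| 1% | -0.3 | 75.1% | 52.5% | 34.4% | 17.2% | 6.4% |
| 1% | -0.2 | 30.5% | 19.6% | 13.1% | 7.2% | 3.3% |
| 1% | -0.1 | 5.7% | 4.3% | 3.5% | 2.6% | 1.7% |
| 1% | 0.0 | 1.0% | 1.0% | 1.0% | 1.0% | 1.0% |
| 1% | 0.1 | 4.1% | 2.7% | 1.8% | 1.2% | 0.9% |
| 1% | 0.2 | 17.0% | 9.8% | 5.9% | 2.9% | 1.2% |
| 1% | 0.3 | 41.6% | 25.4% | 14.9% | 6.2% | 2.0% |
| 1% | 0.4 | 68.3% | 47.0% | 29.3% | 12.7% | 3.3% |
| 1% | 0.5 | 86.9% | 68.7% | 47.8% | 21.7% | 5.4% |

Source: Own Monte Carlo simulations using the direct inversion method, 100,000 replications and $\mu$ = 1.

Table A.2 Power against the exponential distribution. Percentage of rejections.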
(a) $LM_P$

| Significance level | 1 | 2 | 3 | 5 | 10 |
|---|---|---|---|---|---|
| 10% | 100.0% | 100.0% | 100.0% | 99.9% | 83.9% |
| 5% | 100.0% | 100.0% | 100.0% | 99.7% | 72.1% |
| 1% | 100.0% | 100.0% | 100.0% | 97.7% | 40.8% |

(b) Bootstrapped K-S test of Clauset et al. (2009)

| Significance level | 1 | 2 | 3 | 5 | 10 |
|---|---|---|---|---|---|
| 10% | 100.0% | 100.0% | 100.0% | 96.9% | 59.9% |
| 5% | 100.0% | 100.0% | 99.9% | 93.0% | 45.9% |
| 1% | 100.0% | 100.0% | 99.3% | 76.2% | 22.1% |

Source: Own Monte Carlo simulations using the direct inversion method and 100,000 replications for the $LM_P$ test. Matlab script plpva.m of Aaron Clauset and 10,000 replications for the K-S test.

Table A.3 Power against the lognormal distribution. Percentage of rejections. Columns give the percentage of the tail observations included in the test.

(a) $LM_P$

| Significance level | 10% | 5% | 2.5% | 1% | 0.5% |
|---|---|---|---|---|---|
| 10% | 100.0% | 99.2% | 96.4% | 88.4% | 81.1% |
| 5% | 99.9% | 98.0% | 92.2% | 79.5% | 69.5% |
| 1% | 98.7% | 90.3% | 75.0% | 53.5% | 40.6% |

(b) Bootstrapped K-S test of Clauset et al. (2009)

| Significance level | 10% | 5% | 2.5% | 1% | 0.5% |
|---|---|---|---|---|---|
| 10% | 98.3% | 91.9% | 81.8% | 67.7% | 58.0% |
| 5% | 95.9% | 84.6% | 70.1% | 54.6% | 44.4% |
| 1% | 84.6% | 62.9% | 43.3% | 28.8% | 20.4% |

Source: Own Monte Carlo simulations using the direct inversion method and 100,000 replications for the $LM_P$ test. Matlab script plpva.m of Aaron Clauset and 10,000 replications for the K-S test.

References

Aban IB, Meerschaert MM (2004) Generalized least-squares estimators for the thickness of heavy tails. J Stat Plann Infer 119:341-352.

Johnson NL, Kotz S (1970) Distributions in Statistics: Continuous Univariate Distributions. Volume 1. Houghton Mifflin Company, Boston.

Mood AM, Graybill FA, Boes DC (1974) Introduction to the Theory of Statistics. 3rd edn. International Student edn. McGraw-Hill Book Company, London.

Nishiyama Y, Osada S (2004) Statistical theory of rank size rule regression under Pareto distribution. Discussion Paper 009 (January), 21COE, Interfaces for Advanced Economic Analysis, Kyoto University. http://www.kier.kyoto-u.ac.jp/coe21/dp/01-10/DP009nishiyama%26oasada.pdf. Accessed September 2012.

Nishiyama Y, Osada S, Sato Y (2008) OLS estimation and the t test revisited in rank-size rule regression. J Reg Sci 48:691-716 [Erratum in J Reg Sci 2009, 49:241].
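
Computational note. The closed form (A.9) only needs the Pareto MLE and the sample mean of $\mu/x_i$, so it can be computed in a few lines. The following is a minimal sketch, not the author's code. It assumes $\mu$ is known, and it assumes that the alternative has distribution function $F(x) = 1 - (1 + (x-\mu)/\lambda)^{-\alpha}$ for $x \ge \mu$, which is inferred here from the score (A.1) rather than quoted from the text. The function names (`pareto_mle`, `lm_pareto`, `draw_burr`, `rejection_rate`) are hypothetical, and the default of 2,000 replications is chosen for speed, far below the 100,000 used for the tables.

```python
import numpy as np


def pareto_mle(x, mu):
    """Pareto MLE of alpha: the inverse of the sample mean of log(x_i / mu)."""
    x = np.asarray(x, dtype=float)
    return 1.0 / np.mean(np.log(x / mu))


def lm_pareto(x, mu):
    """LM_P statistic of (A.9) for H0: lambda = mu, with mu treated as known."""
    x = np.asarray(x, dtype=float)
    n = x.size
    a = pareto_mle(x, mu)
    # z2_hat: population mean of mu/x under the Pareto law minus its sample mean
    z2 = a / (a + 1.0) - np.mean(mu / x)
    return n * (a + 2.0) * (a + 1.0) ** 4 / a * z2 ** 2


def draw_burr(n, alpha, lam, mu, rng):
    """Draw by direct inversion of F(x) = 1 - (1 + (x - mu)/lam)^(-alpha).

    ASSUMPTION: this distribution function is inferred from the score (A.1).
    With lam == mu it reduces to the Pareto law.
    """
    u = rng.random(n)  # uniform on [0, 1), so 1 - u is strictly positive
    return mu + lam * ((1.0 - u) ** (-1.0 / alpha) - 1.0)


def rejection_rate(alpha, delta, n=1000, reps=2000, crit=3.841458820694124,
                   mu=1.0, seed=0):
    """Share of Monte Carlo samples with LM_P above the critical value.

    The default crit is the 5% critical value of a chi-square(1) variable;
    the alternative is lambda = mu * (1 + delta).
    """
    rng = np.random.default_rng(seed)
    lam = mu * (1.0 + delta)
    stats = np.array([
        lm_pareto(draw_burr(n, alpha, lam, mu, rng), mu) for _ in range(reps)
    ])
    return float(np.mean(stats > crit))
```

Calling `rejection_rate(2.0, 0.0)` estimates the empirical size at the 5% level, and a nonzero `delta` estimates power against the assumed alternative. With so few replications the estimates are noisy and should only be compared loosely with Table A.1.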