Statistical Distributions
James B. McDonald
Brigham Young University
May 2013

The research assistance of Brad Larsen, Patrick Turley, and Sean Kerman is gratefully acknowledged, as are comments from Richard Michelfelder and Panayiotis Theodossiou.

Outline
1. Introduction
2. Some families of statistical distributions
3. Regression applications
4. Censored regression
5. Qualitative response models
6. Option pricing
7. VaR (value at risk)
8. Conclusion

2. Some families of statistical distributions
a. Families: f(y; θ), θ = vector of parameters
i. GB: GB1, GB2, GG (0 < Y)

[Figure: GB distribution tree]

Probability density functions

$GB(y; a, b, c, p, q) = \dfrac{|a|\, y^{ap-1} \left(1 - (1-c)(y/b)^a\right)^{q-1}}{b^{ap} B(p,q) \left(1 + c\,(y/b)^a\right)^{p+q}}$, for $0 < y^a < b^a/(1-c)$

$GB1(y; a, b, p, q) = GB(y; a, b, c=0, p, q) = \dfrac{|a|\, y^{ap-1} \left(1 - (y/b)^a\right)^{q-1}}{b^{ap} B(p,q)}$

$GB2(y; a, b, p, q) = GB(y; a, b, c=1, p, q) = \dfrac{|a|\, y^{ap-1}}{b^{ap} B(p,q) \left(1 + (y/b)^a\right)^{p+q}}$

$GG(y; a, \beta, p) = \dfrac{|a|\, y^{ap-1} e^{-(y/\beta)^a}}{\beta^{ap}\, \Gamma(p)}$

a controls peakedness
b is a scale parameter
c defines the domain, $0 < y^a < b^a/(1-c)$
p, q are shape parameters

[Figure: GB2 PDF evaluated at different parameter values]
Families (continued)
ii. EGB: EGB1, EGB2, EGG (Y is real valued)

[Figure: EGB distribution tree]

Probability density functions

$EGB(y; m, \sigma, c, p, q) = \dfrac{e^{p(y-m)/\sigma} \left(1 - (1-c)\, e^{(y-m)/\sigma}\right)^{q-1}}{|\sigma|\, B(p,q) \left(1 + c\, e^{(y-m)/\sigma}\right)^{p+q}}$, for $-\infty < (y-m)/\sigma < \ln\!\left(\frac{1}{1-c}\right)$

$EGB1(y; m, \sigma, p, q) = \dfrac{e^{p(y-m)/\sigma} \left(1 - e^{(y-m)/\sigma}\right)^{q-1}}{|\sigma|\, B(p,q)}$

$EGB2(y; m, \sigma, p, q) = \dfrac{e^{p(y-m)/\sigma}}{|\sigma|\, B(p,q) \left(1 + e^{(y-m)/\sigma}\right)^{p+q}}$

$EGG(y; m, \sigma, p) = \dfrac{e^{p(y-m)/\sigma}\, e^{-e^{(y-m)/\sigma}}}{|\sigma|\, \Gamma(p)}$

m controls location
σ is a scale parameter
c defines the domain
p, q are shape parameters

[Figure: EGB2 PDF evaluated at different parameter values]

iii. SGT (skewed generalized t): SGED, GT, ST, t, normal (Y is real valued)

[Figure: SGT distribution tree. 5 parameters: SGT. 4 parameters: SGED, GT, ST. 3 parameters: SLaplace, GED, SNormal, t, SCauchy. 2 parameters: Laplace, Uniform, Normal, Cauchy. Special and limiting cases are reached by setting λ = 0, p = 1, p = 2, q = 1/2, p → ∞, or q → ∞.]

Probability density functions

$SGT(y; m, \phi, \lambda, p, q) = \dfrac{p}{2 \phi\, q^{1/p} B(1/p, q) \left(1 + \dfrac{|y-m|^p}{q\, \phi^p \left(1 + \lambda\, \mathrm{sign}(y-m)\right)^p}\right)^{q + 1/p}}$

$SGED(y; m, \phi, \lambda, p) = \dfrac{p\, \exp\!\left\{-\left(\dfrac{|y-m|}{\left(1 + \lambda\, \mathrm{sign}(y-m)\right) \phi}\right)^{p}\right\}}{2 \phi\, \Gamma(1/p)}$

m = mode (location parameter)
φ = scale
λ = skewness; the area to the left of m is $(1-\lambda)/2$, with $-1 < \lambda < 1$
p, q = shape parameters (tail thickness; moments exist for orders below pq = d.f.)

[Figure: SGT PDF evaluated at different parameter values]

iv. IHS

Probability density functions

$Y = a + b \sinh\!\left(\lambda + N(0,1)/k\right)$

$IHS(y; \mu, \sigma, k, \lambda) = \dfrac{k \exp\!\left\{-\dfrac{k^2}{2} \left[\ln\!\left(\dfrac{y - \mu + \delta\sigma}{\sigma} + \sqrt{\theta^2 + \dfrac{(y - \mu + \delta\sigma)^2}{\sigma^2}}\right) - (\lambda + \ln\theta)\right]^2\right\}}{\sqrt{2\pi \left(\theta^2 + \dfrac{(y - \mu + \delta\sigma)^2}{\sigma^2}\right) \sigma^2}}$

where $\theta = 1/\sigma_w$, $\delta = \mu_w/\sigma_w$, $\mu_w = .5\left(e^{\lambda} - e^{-\lambda}\right) e^{.5 k^{-2}}$, and $\sigma_w = .5 \left(e^{2\lambda + k^{-2}} + e^{-2\lambda + k^{-2}} + 2\right)^{.5} \left(e^{k^{-2}} - 1\right)^{.5}$

μ = mean
σ² = variance
λ = skewness parameter
k = tail thickness
$N(y; \mu, \sigma^2) = \lim_{k \to \infty} IHS(y; \mu, \sigma, k, \lambda = 0)$

[Figure: IHS PDF evaluated at different parameter values]
Families (continued)
v. g-and-h distribution (Y is real valued)

Definition:
$Y_{g,h}(Z) = a + b \left(\dfrac{e^{gZ} - 1}{g}\right) e^{hZ^2/2}$, where $Z \sim N[0,1]$

[Figure: g-and-h transformation for h > 0 and h < 0]

Special cases:
$Y_{0,0}(Z) = a + bZ \sim N(a, b^2)$
$Y_{g,h=0}(Z) = a + b \left(\dfrac{e^{gZ} - 1}{g}\right)$ is known as the g distribution; the parameter g allows for skewness.
$Y_{g=0,h}(Z) = a + bZ e^{hZ^2/2}$ is known as the h distribution:
• Symmetric
• Allows for thick tails

[Figure: g-and-h PDF evaluated at different parameter values with h > 0]
[Figure: g-and-h PDF evaluated at different parameter values with h < 0]

vi. Other distributions: extreme value, Pearson family, …
vii. Extensions: 1. θ(x), 2. Multivariate

2. Some families of statistical distributions
b. Properties
i. Moments
1. GB family

$E_{GB}(Y^h) = \dfrac{b^h B(p + h/a, q)}{B(p,q)}\; {}_2F_1\!\left[p + h/a,\; h/a;\; p + q + h/a;\; c\right]$, for $h < aq$ with $c = 1$

a. GB1: $E_{GB1}(Y^h) = \dfrac{b^h B(p + h/a, q)}{B(p,q)}$

b. GB2: $E_{GB2}(Y^h) = \dfrac{b^h B(p + h/a, q - h/a)}{B(p,q)}$, for $-p < h/a < q$
c. GG: $E_{GG}(Y^h) = \dfrac{\beta^h\, \Gamma(p + h/a)}{\Gamma(p)}$, for $h/a > -p$

2. EGB family

$M_{EGB}(t) = E\!\left(e^{ty}\right) = \dfrac{e^{mt} B(p + t\sigma, q)}{B(p,q)}\; {}_2F_1\!\left[p + t\sigma,\; t\sigma;\; p + q + t\sigma;\; c\right]$, for $t < q/\sigma$ with $c = 1$

EGB moments, with $\psi(s) = d \ln \Gamma(s) / ds$:

                  EGG              EGB1                          EGB2
Mean              m + σψ(p)        m + σ[ψ(p) - ψ(p+q)]          m + σ[ψ(p) - ψ(q)]
Variance          σ²ψ'(p)          σ²[ψ'(p) - ψ'(p+q)]           σ²[ψ'(p) + ψ'(q)]
Skewness          σ³ψ''(p)         σ³[ψ''(p) - ψ''(p+q)]         σ³[ψ''(p) - ψ''(q)]
Excess kurtosis   σ⁴ψ'''(p)        σ⁴[ψ'''(p) - ψ'''(p+q)]       σ⁴[ψ'''(p) + ψ'''(q)]

[Figure: EGB2 moment space]

3. SGT family

$E_{SGT}(y - m)^h = \dfrac{\phi^h q^{h/p}}{2} \left[(-1)^h (1 - \lambda)^{h+1} + (1 + \lambda)^{h+1}\right] \dfrac{B\!\left(\frac{h+1}{p},\; q - \frac{h}{p}\right)}{B\!\left(\frac{1}{p},\; q\right)}$, for $h < pq$ = d.f.

$E_{SGED}(y - m)^h = \dfrac{\phi^h}{2} \left[(-1)^h (1 - \lambda)^{h+1} + (1 + \lambda)^{h+1}\right] \dfrac{\Gamma\!\left(\frac{h+1}{p}\right)}{\Gamma\!\left(\frac{1}{p}\right)}$

[Figure: SGT family moment space]

4. IHS

[Figure: IHS moment space]

5. g-and-h family

$E\!\left(Y_{g,h}^n\right) = \sum_{i=0}^{n} \binom{n}{i} a^{n-i} b^{i}\; \dfrac{\sum_{j=0}^{i} (-1)^j \binom{i}{j} \exp\!\left\{\dfrac{\left((i-j) g\right)^2}{2 (1 - ih)}\right\}}{g^i \sqrt{1 - ih}}$

Moments exist up to order 1/h (0 < h).

[Figure: g-and-h moment space (h > 0), visually equivalent to the IHS]
[Figure: moment space for g-and-h (h > 0) and g-and-h (h real)]
[Figure: moment space of SGT, EGB2, IHS, and g-and-h]

ii. Cumulative distribution functions (see appendix)
• Involve the incomplete gamma and beta functions

iii. Gini coefficients (G)

Definition:
$G = \dfrac{1}{2\mu} \int_0^\infty \!\! \int_0^\infty |x - y|\, f(x; \theta)\, f(y; \theta)\, dx\, dy$

$G = 1 - \dfrac{\int_0^\infty \left(1 - F(y)\right)^2 dy}{\int_0^\infty \left(1 - F(y)\right) dy}$ (Dorfman, 1979, RESTAT)

Interpretation: G = 2A, where A is the area between the line of equality and the Lorenz curve.

[Figure: Lorenz curve and the area A]

Applications:
• Stochastic dominance
• Measures of income and wealth inequality
iv. Incomplete moments

Definition: $\phi(y; h) = \dfrac{\int_{-\infty}^{y} s^h f(s)\, ds}{E(Y^h)}$

Applications:
• Option pricing formulas
• Lorenz curves

Convenient theoretical results (the right-hand sides are cdfs):

Distribution   φ(y; h)
LN             $LN(y; \mu + h\sigma^2, \sigma^2)$
GG             $GG(y; a, \beta, p + h/a)$
GB2            $GB2(y; a, b, p + h/a, q - h/a)$

v. Mixture models

Let $f(y; \theta, s)$ denote a structural or conditional density of the random variable Y, where θ and s denote vectors of distributional parameters. Let the density of s be given by the mixing distribution $g(s; \omega)$. The observed or mixed distribution can be written as

$h(y; \theta, \omega) = \int f(y; \theta, s)\, g(s; \omega)\, ds$

Observed model              Structural model         Mixing distribution
SGT(y; m, φ, λ, p, q)       SGED(y; m, s, λ, p)      IGG(s; p, q^{1/p} φ, q)
GT(y; φ, p, q)              GED(y; s, p)             IGG(s; p, q^{1/p} φ, q)
EGB2(y; m, σ, p, q)         EGG(y; ln s, σ, p)       IGG(s; 1/σ, e^m, q)
GB2(y; a, b, p, q)          GG(y; a, s, p)           IGG(s; a, b, q)
LT(y; μ, σ, q)              LN(y; μ, s)              IGG in s (with a = 1)
t(y; μ, σ, q)               N(y; μ, s)               IGA in s

vi. Hazard functions (duration dependence)

Definition: Let f(s) denote the pdf of a spell (S) or duration of an event. 1 - F(s) is the probability that S > s. The corresponding hazard function is defined by

$h(s) = \dfrac{f(s)}{1 - F(s)}$

which can be thought of as the rate or likelihood that a spell will be completed after surviving s periods.

Applications: Does the probability of ending a strike, unemployment spell, expansion, or stock run depend on the length of the strike, unemployment spell, or run?
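The hazard definition above can be checked numerically. The following is a minimal sketch, assuming Weibull-distributed spells with unit scale (an illustrative choice; the function names are hypothetical). It computes h(s) = f(s) / (1 - F(s)) directly from the pdf and cdf, so the shape parameter can be varied to see a decreasing, constant, or increasing hazard.

```python
import math

def weibull_pdf(s, shape, scale=1.0):
    """Weibull density f(s) for a spell of length s > 0."""
    z = s / scale
    return (shape / scale) * z ** (shape - 1.0) * math.exp(-(z ** shape))

def weibull_cdf(s, shape, scale=1.0):
    """Weibull cdf F(s)."""
    return 1.0 - math.exp(-((s / scale) ** shape))

def hazard(pdf, cdf, s):
    """h(s) = f(s) / (1 - F(s)), straight from the definition."""
    return pdf(s) / (1.0 - cdf(s))

def weibull_hazard(s, shape, scale=1.0):
    """Hazard of an assumed Weibull spell, evaluated via the definition."""
    return hazard(lambda u: weibull_pdf(u, shape, scale),
                  lambda u: weibull_cdf(u, shape, scale), s)

# shape < 1: decreasing hazard; shape = 1: constant; shape > 1: increasing
for shape in (0.5, 1.0, 2.0):
    rates = [weibull_hazard(s, shape) for s in (0.5, 1.0, 2.0)]
    print(shape, [round(r, 4) for r in rates])
```

With shape = 1 the Weibull reduces to the exponential, whose hazard does not depend on the elapsed duration.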
With unemployment:
• A job seeker might lower their reservation wage and become more likely to find a job: increasing hazard function.
• However, if being out of work is a signal of damaged goods, a longer spell out of work might decrease employment opportunities: decreasing hazard function.

An alternative example is modeling the time between stock trades.
• Engle and Russell (1998), "Autoregressive conditional duration: a new model for irregularly spaced transaction data," Econometrica 66: 1127-1162.
• The hazard function of the time between trades is decreasing in t: the longer the time since the last trade, the less likely the next trade is to occur.

Hazard functions, applications: bubbles
• McQueen and Thorley (1994), "Bubbles, stock returns, and duration dependence," Journal of Financial and Quantitative Analysis 29: 379-401.
• Under the efficient markets hypothesis, stock runs should not exhibit duration dependence (constant hazard function).
• McQueen and Thorley argue that asset prices may contain "bubbles" which grow each period until they "burst," causing the stock market to crash. Hence, bubbles cause runs of positive stock returns to exhibit duration dependence: the longer the run, the less likely it is to end (decreasing hazard function). Runs of negative stock returns exhibit no duration dependence.
• Grimshaw, McDonald, McQueen, and Thorley (2005), Communications in Statistics: Simulation and Computation 34: 451-463.

What model should we use to characterize duration dependence?
• Exponential: the hazard function is constant
• Gamma: the hazard function can increase, decrease, or be constant
• Weibull: the hazard function can increase, decrease, or be constant
• Generalized gamma: the hazard function can be increasing, decreasing, constant, ∪-shaped, or ∩-shaped

[Figure: possible shapes for the GG hazard functions]
2. Some families of statistical distributions
c. Model selection
i. Goodness of fit statistics

• Log-likelihood values

$\ell(\theta) = \sum_{i=1}^{n} \ln f(y_i; \theta)$ for individual data

$\ell(\theta) = \ln(n!) + \sum_{i=1}^{g} \left[n_i \ln p_i(\theta) - \ln(n_i!)\right]$ for grouped data

Partition the data into g groups, $I_i = [Y_{i-1}, Y_i)$, $i = 1, 2, \ldots, g$
Empirical frequency: $n_i/n$, with $n = \sum_{i=1}^{g} n_i$
Theoretical frequency: $p_i(\theta) = \int_{I_i} f(y; \theta)\, dy$

• Possible measures

$SAE = \sum_{i=1}^{g} \left|\dfrac{n_i}{n} - p_i(\theta)\right|$

$SSE = \sum_{i=1}^{g} \left(\dfrac{n_i}{n} - p_i(\theta)\right)^2$

$\chi^2 = n \sum_{i=1}^{g} \dfrac{\left(n_i/n - p_i(\theta)\right)^2}{p_i(\theta)} \sim \chi^2(g - \#\text{parameters} - 1)$

• Akaike Information Criterion (AIC)

$AIC = 2k - 2\ell$, where k is the number of estimated parameters
• A tool for model selection
• Attaches a penalty to over-fitting a model

ii. Testing nested models

$H_0: g(\theta) = 0$

Examples:
1. $H_0$: SGT = GT, that is, $H_0: \lambda = 0$
2. $H_0$: SGT = Normal, that is, $H_0: p = 2,\ \lambda = 0,$ and $q \to \infty$

Likelihood ratio tests

$LR = 2(\ell - \ell^*) \sim^a \chi^2(r)$, where $\ell^*$ is the maximized log-likelihood under $H_0$ and r denotes the number of independent restrictions

$LR_1 = 2(\ell_{SGT} - \ell_{GT}) \sim^a \chi^2(1)$
$LR_2 = 2(\ell_{SGT} - \ell_{Normal}) \sim^a \chi^2(3)$

Wald test

$W = g(\hat\theta_{MLE})' \left[\mathrm{var}\!\left(g(\hat\theta_{MLE})\right)\right]^{-1} g(\hat\theta_{MLE}) \sim^a \chi^2(r)$

$W_1 = (\hat\lambda - 0) \left[\mathrm{Var}(\hat\lambda)\right]^{-1} (\hat\lambda - 0) \sim^a \chi^2(1)$
2. Some families of statistical distributions
d. An example: the distribution of stock returns

$y_t = \ln(P_t / P_{t-1}) \approx \dfrac{P_t - P_{t-1}}{P_{t-1}}$

Daily, weekly, and monthly excess returns (1/2/2002 to 12/29/2006) from the CRSP database (NYSE, AMEX, and NASDAQ): 4547 companies

H0: skewness = 0: $CI_{.95} = \left(-2\sqrt{6/n},\ 2\sqrt{6/n}\right)$
H0: excess kurtosis = 0: $CI_{.95} = \left(-2\sqrt{24/n},\ 2\sqrt{24/n}\right)$
H0: returns ~ N(μ, σ²): $CI_{.95} = (0,\ 5.99)$, with

$JB = n \left[\dfrac{\text{skew}^2}{6} + \dfrac{(\text{excess kurtosis})^2}{24}\right] \sim \chi^2(2)$, $\chi^2_{.05}(2) = 5.99$

% of stocks for which excess returns statistics are in the 95% C.I.

           H0: Skewness = 0   H0: Excess kurtosis = 0   H0: Normal
Daily      16.38%             0.04%                     0.09%
Weekly     30.61%             4.88%                     4.75%
Monthly    66.79%             56.65%                    53.77%

[Figure: CRSP daily stocks, excess returns. Skewness (horizontal axis, -4 to 4) against kurtosis (vertical axis, 0 to 60) for each CRSP stock, plotted with the EGB2, SGT, and IHS bounds of the admissible moment space]
[Figure: CRSP weekly stocks, excess returns. Same axes and bounds]
[Figure: CRSP monthly stocks, excess returns. Same axes and bounds]

Fraction of stocks in the admissible skewness-kurtosis space

           daily      weekly     monthly
EGB2       15.48%     43.81%     50.80%
IHS        83.92%     84.39%     61.97%
SGT        87.62%     89.00%     95.10%
g-and-h    100.00%    99.98%     98.99%

Fitting a PDF to normal excess returns
Estimated PDFs for US Steel
daily excess returns

Company name   Skew    Kurtosis   JB stat
US Steel       0.06    3.308      5.62

Estimated PDF   logL       SSE      SAE     Chi^2
Normal          2753.52    0.001    0.12    27.81
EGB2            2756.83    0.001    0.11    23.38
IHS             2756.76    0.001    0.11    23.46
SGT             2758.78    0.001    0.12    28.19

[Figure: histogram of US Steel daily excess returns (horizontal axis: excess returns, -0.15 to 0.2) with fitted Normal, EGB2, IHS, and SGT densities]

Fitting a PDF to leptokurtic excess returns
Estimated PDFs for iShares daily excess returns

Company name   Skew      Kurtosis   JB stat
iShares        -29.06    965.09     48733899.02

Estimated PDF   logL       SSE      SAE     Chi^2
Normal          2516.86    0.099    0.93    1433.33
EGB2            3713.99    0.002    0.13    43.47
IHS             3795.21    0.001    0.12    33.43
SGT             3810.07    0.003    0.21    79.35

[Figure: histogram of iShares daily excess returns (horizontal axis: excess returns, -0.06 to 0.08) with fitted Normal, EGB2, IHS, and SGT densities]

3. Regression applications
a. Background

Model: $Y_t = X_t \beta + \varepsilon_t$
$X_t$: 1×K vector of observations on the explanatory variables
$\beta$: K×1 vector of unknown coefficients
$\varepsilon_t$: independently and identically distributed random disturbances with pdf $f(\varepsilon; \theta)$

• If the errors are normally distributed, OLS will be unbiased and minimum variance.
• However, if the errors are not normally distributed:
  - OLS will still be BLUE
  - There may be more efficient nonlinear estimators
3. Regression applications
b. Alternative estimators
i. Estimation

OLS: $\hat\beta_{OLS} = \arg\min_\beta \sum_{t=1}^{n} (Y_t - X_t\beta)^2$
LAD: $\hat\beta_{LAD} = \arg\min_\beta \sum_{t=1}^{n} |Y_t - X_t\beta|$
Lp: $\hat\beta_{L_p} = \arg\min_\beta \sum_{t=1}^{n} |Y_t - X_t\beta|^p$

M-estimators: $\hat\beta_{M} = \arg\min_\beta \sum_{t=1}^{n} \rho(Y_t - X_t\beta)$
• Includes OLS, LAD, and Lp as special cases
• Includes MLE (QMLE or partially adaptive estimators) as a special case, where $\rho(\varepsilon; \theta) = -\ln f(\varepsilon; \theta)$:

$(\hat\beta, \hat\theta)_{MLE} = \arg\min_{\beta, \theta} \sum_{t=1}^{n} \rho(Y_t - X_t\beta; \theta)$

with f taken to be, for example, the SGT, SGED, EGB2, or IHS.

ii. Influence functions: $\psi(\varepsilon) = \rho'(\varepsilon)$

[Figure: influence functions for OLS and LAD, and a redescending influence function]

iii. Asymptotic distribution of extremum estimators

For $\hat\beta = \arg\min_\beta H(\beta)$,

$\hat\beta \sim^a N\!\left(\beta;\ A^{-1} B A^{-1}\right)$ (sandwich), where $A = E\!\left(\dfrac{d^2 H}{d\beta\, d\beta'}\right)$ and $B = E\!\left(\dfrac{dH}{d\beta} \dfrac{dH}{d\beta'}\right)$

iv. Other estimators

Semiparametric (kernel estimator, adaptive MLE)

$\hat\beta_{SP} = \arg\min_\beta \sum_{t=1}^{n} -\ln f_K(Y_t - X_t\beta)$, where $f_K(\varepsilon) = \dfrac{1}{nh} \sum_{i=1}^{n} K\!\left(\dfrac{\varepsilon - e_i}{h}\right)$ and $e_i = Y_i - X_i \hat\beta_{OLS}$

K denotes a kernel, and h is the window width.

Generalized Method of Moments (GMM)

$\hat\beta_{GMM} = \arg\min_\beta\ g' Q g$, where $g = \sum_{i=1}^{n} Z_i'\, h(\varepsilon_i)$ and $\varepsilon_i = Y_i - X_i\beta$

Z denotes a vector of instruments (can be X)
Q is a positive definite matrix, $Q = \mathrm{Var}^{-1}(g)$

3. Regression applications
c. A Monte Carlo comparison of alternative estimators
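To make the OLS and LAD criteria concrete, here is a minimal Monte Carlo sketch using only the Python standard library. Everything in it is an assumption for illustration: a single regressor with intercept, a true slope of 1, thick-tailed mixture errors .9 N(0, 1/9) + .1 N(0, 9), and LAD approximated by iteratively reweighted least squares rather than by an exact linear-programming solver. It is not the procedure behind any table in these slides.

```python
import random

def wls_line(x, y, w):
    """Weighted least squares fit of y = b0 + b1*x; returns (b0, b1)."""
    sw = sum(w)
    xb = sum(wi * xi for wi, xi in zip(w, x)) / sw
    yb = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xb) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xb) * (yi - yb) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    return yb - b1 * xb, b1

def ols_line(x, y):
    """OLS is the equal-weight case: minimizes the sum of squared residuals."""
    return wls_line(x, y, [1.0] * len(x))

def lad_line(x, y, iters=30, eps=1e-6):
    """Approximate LAD (sum of absolute residuals) by reweighted least squares."""
    b0, b1 = ols_line(x, y)
    for _ in range(iters):
        # weight 1/|residual| turns squared loss into absolute loss
        w = [1.0 / max(abs(yi - b0 - b1 * xi), eps) for xi, yi in zip(x, y)]
        b0, b1 = wls_line(x, y, w)
    return b0, b1

def mixture_error(rng):
    """Draw from .9*N(0, 1/9) + .1*N(0, 9): mean 0, variance 1, thick tails."""
    sd = 1.0 / 3.0 if rng.random() < 0.9 else 3.0
    return rng.gauss(0.0, sd)

def slope_rmse(n=50, reps=200, seed=0):
    """RMSE of the OLS and LAD slope estimates over simulated samples."""
    rng = random.Random(seed)
    sq = {"OLS": 0.0, "LAD": 0.0}
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [-1.0 + xi + mixture_error(rng) for xi in x]  # assumed design, slope = 1
        sq["OLS"] += (ols_line(x, y)[1] - 1.0) ** 2
        sq["LAD"] += (lad_line(x, y)[1] - 1.0) ** 2
    return {name: (total / reps) ** 0.5 for name, total in sq.items()}

rmse = slope_rmse()
print(rmse)
```

Under thick-tailed errors the LAD slope is typically the more precise of the two, which is the motivation for looking beyond OLS when the errors are not normal.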
A Monte Carlo comparison of alternative estimators

Model: $y_t = -1 + X_t + \varepsilon_t$

Error distributions (zero mean and unit variance):
• Normal: N(0, 1)
• Mixture: .9·N(0, 1/9) + .1·N(0, 9); skewness = 0, kurtosis = 24.3
• Skewed: $\left(LN(0,1) - e^{.5}\right) / \left(e(e - 1)\right)^{.5}$; skewness = 6.18, kurtosis = 113.9

[Figure: the Normal, Mixture, and Skewed error distributions located in the skewness-kurtosis plane]

RMSE for slope estimators (sample size = 50, T = 1000 replications)

Estimators   Normal   Mixture (thick tails)   Skewed
OLS          .275     .287                    .280
LAD          .332     .122                    .159
SGED         .335     .128                    .060
ST           .293     .112                    .054
GT           .314     .133                    .135
SGT          .335     .125                    .073
EGB2         .287     .125                    .049
IHS          .285     .119                    .054
SP = AML     .285     .114                    .128
GMM          .319     .115                    .088

3. Regression applications
d. An application: CAPM
   i. Error distribution effects
   ii. ARCH effects

An application: CAPM
i. CAPM and the error distribution

Daily, weekly, and monthly excess returns (1/2/2002 to 12/29/2006) from the CRSP database (NYSE, AMEX, and NASDAQ): 4547 companies

Percent of stocks for which OLS residual statistics are in the 95% C.I.

           H0: Skewness = 0   H0: Excess kurtosis = 0   H0: Normal (JB)
Daily      14.14%             0.02%                     0%
Weekly     28.13%             3.91%                     3.43%
Monthly    67.56%             57.14%                    54.76%

Percent of stocks for which ST residual statistics are in the 95% C.I.

           H0: Skewness = 0   H0: Excess kurtosis = 0   H0: Normal (JB)
Daily      14.05%             0.02%                     0%
Weekly     28.82%             3.83%                     3.39%
Monthly    64.04%             54.72%                    51.48%
i. CAPM and the error distribution (IHS residuals)

Percent of stocks for which IHS residual statistics are in the 95% C.I.

           H0: Skewness = 0   H0: Excess kurtosis = 0   H0: Normal (JB)
Daily      13.99%             0.02%                     0%
Weekly     27.89%             3.83%                     3.36%
Monthly    65.54%             55.71%                    52.32%

An application: CAPM with alternative error distributions

Statistics of OLS residuals

Company name                Skewness   Kurtosis   JB stat
UNITED NATURAL FOODS INC    -0.074     2.8004     0.1543
99 CENTS ONLY STORES        7.6594     85.0456    1.7541

Estimated betas

Company name                OLS     T       GT      SGED    EGB2    IHS     ST      SGT
UNITED NATURAL FOODS INC    0.313   0.313   0.335   0.334   0.303   0.302   0.314   0.335
99 CENTS ONLY STORES        0.184   0.125   0.125   0.110   0.109   0.106   0.110   0.110

An application: CAPM with and without ARCH effects
i. CAPM and the error distribution
ii. CAPM: how about ARCH effects?

Review:
• If errors are normal and have no ARCH effects, OLS is MLE.
• If errors are not normal and have no ARCH effects, OLS is BLUE, but neither MLE nor efficient.
• If errors are normal and have ARCH effects, OLS is BLUE, but not efficient.
• If errors are not normal and have ARCH effects, OLS is BLUE, but not efficient.

ii. CAPM: ARCH effects (continued)

Model: $Y_t = X_t\beta + \varepsilon_t$, $\varepsilon_t = u_t \left(\alpha_0 + \alpha_1 \varepsilon_{t-1}^2\right)^{.5}$

Percent of stocks exhibiting ARCH(1) effects (% rejecting $H_0: \alpha_1 = 0$)

OLS        0.10 level   0.05 level
Daily      63.2%        60.0%
Weekly     29.2%        24.1%
Monthly    18.7%        13.7%

ST         0.10 level   0.05 level
Daily      63.2%        59.9%
Weekly     29.1%        23.9%
Monthly    16.9%        12.3%

IHS        0.10 level   0.05 level
Daily      63.3%        60.0%
Weekly     29.3%        24.1%
Monthly    18.9%        13.9%
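One common way to test $H_0: \alpha_1 = 0$ is Engle's Lagrange multiplier statistic, n·R² from a regression of the squared residual on a constant and its own lag, compared with a χ²(1) critical value. The slides do not say which test produced the rejection rates, so the following is only a sketch under that assumption. It simulates ARCH(1) errors with normal $u_t$ and hypothetical values $\alpha_0 = \alpha_1 = 0.5$, and applies the statistic to the simulated errors directly rather than to regression residuals.

```python
import random

def simulate_arch1(n, alpha0, alpha1, seed=0, burn=200):
    """Simulate eps_t = u_t * (alpha0 + alpha1 * eps_{t-1}^2)^.5 with u_t ~ N(0,1)."""
    rng = random.Random(seed)
    eps_prev, out = 0.0, []
    for t in range(n + burn):
        sigma_t = (alpha0 + alpha1 * eps_prev ** 2) ** 0.5
        eps_prev = rng.gauss(0.0, 1.0) * sigma_t
        if t >= burn:  # drop the burn-in draws
            out.append(eps_prev)
    return out

def arch1_lm_stat(resid):
    """n * R^2 from regressing e_t^2 on a constant and e_{t-1}^2."""
    sq = [e * e for e in resid]
    y, x = sq[1:], sq[:-1]
    n = len(y)
    xb, yb = sum(x) / n, sum(y) / n
    sxx = sum((xi - xb) ** 2 for xi in x)
    syy = sum((yi - yb) ** 2 for yi in y)
    sxy = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y))
    # with one regressor, R^2 is the squared sample correlation
    return n * sxy ** 2 / (sxx * syy)

CHI2_1_CRIT_05 = 3.84  # 5% critical value of chi-square with 1 d.f.

arch_errors = simulate_arch1(2000, alpha0=0.5, alpha1=0.5)
stat = arch1_lm_stat(arch_errors)
print(round(stat, 2), stat > CHI2_1_CRIT_05)
```

In an application the input would be the residuals from the fitted CAPM regression, and the rejection rate across stocks would be the share of statistics above the critical value.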
ii. CAPM: ARCH effects (continued)

ARCH simulations

$y_t = 0 + .9 X_t + \varepsilon_t$, t = 1, …, 60
X = monthly excess market returns, 1/2002 to 12/31/2006

Error distributions:
• $\varepsilon_t \sim N(0, \sigma^2)$
• ARCH N(1): $\varepsilon_t = u_t \left(\alpha_0 + \alpha_1 \varepsilon_{t-1}^2\right)^{.5}$, where $u_t \sim N(0,1)$
• ARCH t(1): $\varepsilon_t = u_t \left(\alpha_0 + \alpha_1 \varepsilon_{t-1}^2\right)^{.5}$, where $u_t \sim t(5)$

Root mean square error (RMSE) for 10,000 replications

Errors:        N(0, σ²)              N(0,1), ARCH(1)       t(5), ARCH(1)
Estimation:    Non-ARCH   ARCH       Non-ARCH   ARCH       Non-ARCH   ARCH
OLS/Normal     0.352      0.356      0.347      0.291      0.353      0.300
LAD            0.444      0.446      0.397      0.369      0.315      0.297
T              0.358      0.363      0.338      0.293      0.283      0.265
GED            0.381      0.389      0.357      0.318      0.306      0.285
GT             0.387      0.396      0.362      0.322      0.306      0.286
SGED           0.406      0.417      0.374      0.341      0.318      0.297
EGB2           0.371      0.376      0.352      0.312      0.300      0.281
IHS            0.368      0.377      0.348      0.319      0.291      0.275
ST             0.375      0.382      0.350      0.310      0.293      0.277
SGT            0.409      0.420      0.376      0.344      0.316      0.297

4. Censored regression models
   a. Basic framework
   b. Simulation study

Censored regression
a. Basic framework

Model: $y_i^* = X_i\beta + \varepsilon_i$
$y_i = y_i^*$ if $y_i^* > 0$
$y_i = 0$ if $y_i^* \le 0$

Log-likelihood function:

$\ell(\beta, \theta) = \sum_{y_i > 0} \ln f(y_i - X_i\beta; \theta) + \sum_{y_i^* \le 0} \ln F(-X_i\beta; \theta)$

b. Censored regression: nonnormality and heteroskedasticity

5. Qualitative response models
5. Qualitative response models
a. Basic framework

Model: $y_i^* = X_i\beta + \varepsilon_i$
$y_i = 1$ if $y_i^* > 0$ and 0 otherwise

$\Pr(y_i = 1 \mid X_i) = \Pr(y_i^* = X_i\beta + \varepsilon_i > 0) = \Pr(\varepsilon_i > -X_i\beta) = \int_{-\infty}^{X_i\beta} f(s; \theta)\, ds = F(X_i\beta; \theta)$

(Here f and F are the pdf and cdf of $-\varepsilon_i$, which coincide with those of $\varepsilon_i$ when the errors are symmetric.)

Log-likelihood function:

$\ell(\beta, \theta) = \sum_{i=1}^{n} \left[y_i \ln F(X_i\beta; \theta) + (1 - y_i) \ln\!\left(1 - F(X_i\beta; \theta)\right)\right]$

The MLE of β will be consistent and asymptotically distributed as

$\hat\beta \sim^a N\!\left(\beta;\ \left[-E\!\left(\dfrac{d^2 \ell}{d\beta\, d\beta'}\right)\right]^{-1}\right)$

if the model is correctly specified.

Probit and logit estimators will be inconsistent if:
• The error distribution is incorrectly specified
• Heteroskedasticity exists, e.g. unmeasured heterogeneity is present
• Relevant variables have been omitted
• The index appears in a nonlinear form

Similar results are associated with censored and truncated regression models.

5. Qualitative response models
b. An application: fraud detection

• Prediction of corporate fraud (Y = 1 denotes fraud)
• Compare financial ratios of companies with averages of the five largest companies (the "virtual" firm)
• 228 companies (114 fraud and 114 non-fraud)
• Variables: accruals to assets, asset quality, asset turnover, days sales in receivables, deferred charges to assets, depreciation, gross margin, increase in intangibles, inventory growth, leverage, operating performance margin, percent uncollectables, receivables growth, sales growth, working capital turnover
• SGT, EGB2, and IHS formulations improve predictions
5. Qualitative response models
c. Some related issues

• Cost of misclassification
• Choice-based sampling
• Heterogeneity
• Semi-parametric estimation procedures

6. Option pricing: European call option
a. The Black-Scholes option pricing formula

The equilibrium price of a European call option is equal to the present value of its expected return at expiration:

$C_f(S_T, T, X) = e^{-rT} E\!\left[C(S_0, 0)\right] = e^{-rT} \int_X^\infty (S_0 - X)\, f(S_0 \mid S_T, T)\, dS_0 = S_T\, \bar\phi\!\left(\dfrac{X}{S_T}; 1\right) - e^{-rT} X\, \bar\phi\!\left(\dfrac{X}{S_T}; 0\right)$

where

$\bar\phi(y; h) = \dfrac{\int_y^\infty s^h f(s)\, ds}{E(y^h)} = 1 - \phi(y; h)$ and $\phi(y; h) = \dfrac{\int_{-\infty}^{y} s^h f(s)\, ds}{E(y^h)}$

involve "normalized incomplete" moments.
6. Option pricing: European call option
b. Some background and alternative formulations

• The Black-Scholes (1973) option pricing formula corresponds to f(s) being the lognormal:
  $\phi_{LN}(y; h) = LN(y; \mu + h\sigma^2, \sigma^2)$, the cdf for the lognormal
• The Black-Scholes-type formula (McDonald and Bookstaber, 1991) corresponding to the generalized gamma is obtained from
  $\phi_{GG}(y; h) = GG(y; a, \beta, p + h/a)$, the cdf for the GG
• The Black-Scholes-type formula (McDonald and Bookstaber, 1991) corresponding to the GB2 is obtained from
  $\phi_{GB2}(y; h) = GB2(y; a, b, p + h/a, q - h/a)$, the cdf for the GB2
• Rebonato (1999) applied $C_{GB2}(S_T, T, X)$ to the Deutschemark.
• Sherrick, Garcia, and Tirupattur (1996) used $C_{Burr3}(S_T, T, X)$ to price soybean futures.
• Theodossiou (2000) developed $C_{SGED}(S_T, T, X)$.
• Savickas (2001) explored the use of $C_{Weibull}(S_T, T, X)$.
• Dutta and Babbel (2005) explore the g-and-h family (4-parameter) of option pricing formulas, $C_{g\&h}(S_T, T, X)$, based on Tukey's nonlinear transformation of a standard normal.
  - They applied the g-and-h to pricing interest rate options on 1-month and 3-month London Inter Bank Offer Rates (LIBOR).
  - The g-and-h distribution and the GB2 perform much better than the lognormal, Burr 3, and Weibull distributions; the g-and-h and GB2 pricing errors are fairly highly correlated.

6. Option pricing: European call option
c. A comparison of pricing behavior (Dutta and Babbel, Journal of Business, 2005)
The comparison calculates the difference between the market price and the predicted price for the g-and-h, GB2, lognormal, Burr 3, and Weibull distributions.

[Figure: Option Pricing]

7. VaR (value at risk)
a. Background and definitions

i. Value at risk (VaR) is the maximum expected loss on a portfolio of assets over a certain time period for a given probability level.

$\alpha = \int_{-\infty}^{R_\alpha} f(R; \theta)\, dR$

R is the return on the asset
θ denotes the distributional parameters
α is the predetermined confidence level or coverage probability
$R_\alpha$ is the corresponding maximum expected loss or conditional threshold, $R_\alpha = F_R^{-1}(\alpha; \theta)$

ii. Standardized returns

$z = \dfrac{R - \mu}{\sigma}$, $R_\alpha = \mu + \sigma Z_\alpha$, $\alpha = \int_{-\infty}^{Z_\alpha} f_Z(z; \theta)\, dz$, $Z_\alpha = F_Z^{-1}(\alpha; \theta)$

iii. Unconditional VaR formulation

Estimate f(R; θ)

iv. Conditional VaR formulation (AR(1)-ABSGARCH(1,1))

$R_t = \gamma_0 + \gamma_1 R_{t-1} + \sigma_t z_t = \mu_t + \sigma_t z_t$
$\sigma_t = \beta_0 + \beta_1 |z_{t-1}|\, \sigma_{t-1} + \beta_2\, \sigma_{t-1}$
$R_{\alpha, t} = \mu_t + \sigma_t F_Z^{-1}(\alpha; \theta)$

7. VaR (value at risk)
b. Models and applications

i. Unconditional VaR formulation
• Exponential: (Hogg, R. V. and S. A. Klugman (1983))
• Gamma: (Cummins et al. (1990))
• Log-gamma: (Ramlau-Hansen (1988)), (Hogg, R. V. and S. A. Klugman (1983))
• Lognormal: (Ramlau-Hansen (1988))
• Stable: (Paulson and Faris (1985))
• Pareto: (Hogg, R. V. and S. A.
Klugman (1983))
• Log-t: (Hogg, R. V. and S. A. Klugman (1983))
• Weibull: (Cummins et al. (1990))
• Burr: (Hogg, R. V. and S. A. Klugman (1983))
• Generalized Pareto: (Hogg, R. V. and S. A. Klugman (1983))
• GB2: (Cummins (1990, 1999, 2007))
• Pearson family: Aiuppa (1988)
• Extreme value distribution: Bali (2003), Bali and Theodossiou (2008)
• IHS: Bali and Theodossiou (2008)

ii. Conditional VaR formulations (Bali and Theodossiou, JRI, 2008)

Data: S&P500 composite index, 1/4/50 to 12/29/2000 (n = 12,832)
Daily percentage log-returns: sample mean = .0341, maximum = 8.71, minimum = -22.90, standard deviation = .874, skewness = 1.622, kurtosis = 45.52

Models:
• Generalized extreme value
• EGB2
• SGT
• IHS

Findings:
• Out-of-sample VaR estimates are rejected for most unconditional specifications.
• Thresholds exhibit time-varying behavior.
• Out-of-sample VaR estimates for the conditional specifications corresponding to the SGT, IHS, and EGB2 perform better than the extreme value distributions.

Selected references for option pricing and VaR

Aiuppa, T. A. 1988. "Evaluation of Pearson curves as an approximation of the maximum probable annual aggregate loss." Journal of Risk and Insurance 55, 425-441.
Bali, T. G., 2003. "An Extreme Value Approach to Estimating Volatility and Value at Risk," Journal of Business, 76: 83-108.
Bali, T. G. and P. Theodossiou, 2007. "A Conditional-SGT-VaR Approach with Alternative GARCH Models," Annals of Operations Research, 151: 241-267.
Bali, T. G. and P. Theodossiou, 2008. "Risk Measurement Performance of Alternative Distribution Functions," Journal of Risk and Insurance, 75: 411-437.
Black, F. (1976). "The Pricing of Commodity Contracts." Journal of Financial Economics 3: 169-179.
Cummins, J. D., G. Dionne, J. B. McDonald, and B. M. Pritchett 1990.
“Applications of the GB2 family of distributions in modeling insurance loss processes.” Insurance: Mathematics and Economics 9: 257-272.
Cummins, J. D., C. Merrill, and J. B. McDonald, 2007. “Risky Loss Distributions and Modeling the Loss Reserve Pay-out Tail.” Review of Applied Economics 3.
Cummins, J. D., R. D. Phillips, and S. D. Smith, 2001. “Pricing Excess of Loss Reinsurance Contracts against Catastrophic Loss.” In Kenneth Froot, ed., The Financing of Catastrophe Risk. Chicago: University of Chicago Press.
Dutta, K. K. and D. F. Babbel, 2005. “Extracting Probabilistic Information from the Prices of Interest Rate Options: Tests of Distributional Assumptions.” Journal of Business 78: 841-870.
Hogg, R. V. and S. A. Klugman, 1983. “On the Estimation of Long Tailed Skewed Distributions with Actuarial Applications.” Journal of Econometrics 23: 91-102.
McDonald, J. B. and R. M. Bookstaber, 1991. “Option Pricing for Generalized Distributions.” Communications in Statistics: Theory and Methods 20(12): 4053-4068.
Paulson, A. S. and N. J. Faris, 1985. “A Practical Approach to Measuring the Distribution of Total Annual Claims.” In J. D. Cummins, ed., Strategic Planning and Modeling in Property-Liability Insurance. Norwell, MA: Kluwer Academic Publishers.
Ramlau-Hansen, H., 1988. “A Solvency Study in Non-life Insurance. Part 1. Analysis of Fire, Windstorm, and Glass Claims.” Scandinavian Actuarial Journal, pp. 3-34.
Rebonato, R., 1999. Volatility and correlations in the pricing of equity, FX and interest-rate options. New York: John Wiley.
Reid, D. H., 1978. “Claim Reserves in General Insurance.” Journal of the Institute of Actuaries 105: 211-296.
Savickas, R., 2001. “A Simple option-pricing formula.” Working paper, Department of Finance, George Washington University, Washington, DC.
Sherrick, B. J., P. Garcia, and V. Tirupattur, 1996.
“Recovering probabilistic information for options markets: Tests of distributional assumptions.” Journal of Futures Markets 16: 545-560.
Theodossiou, Panayiotis, “Skewed Generalized Error Distribution of Financial Assets and Option Pricing,”

Conclusion

END OF PRESENTATION

Appendices
  Cumulative distribution functions
    1. GB, GB1, GB2, GG
    2. EGB2
    3. SGT
    4. SGED
    5. IHS
    6. g-and-h distribution
  Option pricing basics
  VaR: Models and applications discussion

Appendices: Cumulative distribution functions
1. GB, GB1, GB2, and GG
$$GB1(y; a, b, p, q) = \frac{z^{p}\,{}_2F_1[p,\,1-q;\,p+1;\,z]}{p\,B(p,q)} = B_z(p,q)$$
where $z = (y/b)^a$ and
$$B_z(p,q) = \frac{\int_0^z s^{p-1}(1-s)^{q-1}\,ds}{B(p,q)}$$
denotes the incomplete beta function.
$$GB2(y; a, b, p, q) = \frac{z^{p}\,{}_2F_1[p,\,1-q;\,p+1;\,z]}{p\,B(p,q)} = B_z(p,q)$$
where
$$z = \frac{(y/b)^a}{1 + (y/b)^a}$$
$$GG(y; a, \beta, p) = \frac{e^{-(y/\beta)^a}\,(y/\beta)^{ap}}{\Gamma(p+1)}\,{}_1F_1[1;\,p+1;\,(y/\beta)^a] = \Gamma_z(p)$$
where $z = (y/\beta)^a$ and
$$\Gamma_z(p) = \frac{\int_0^z s^{p-1}e^{-s}\,ds}{\Gamma(p)}$$
denotes the incomplete gamma function.
Abramowitz and Stegun (1970, p. 932), McDonald (1984), and Rainville (1960, pp. 60 and 125)

2. EGB2
$$EGB2(y; m, \phi, p, q) = B_z(p,q), \qquad \text{where } z = \frac{e^{(y-m)/\phi}}{1 + e^{(y-m)/\phi}}$$

3. SGT
$$SGT(y; m, \lambda, \phi, p, q) = \frac{1-\lambda}{2} + \frac{1 + \lambda\,\mathrm{sign}(y-m)}{2}\,\mathrm{sign}(y-m)\,B_z(1/p,\,q)$$
where
$$z = \frac{|y-m|^p}{|y-m|^p + q\,\phi^p\,(1 + \lambda\,\mathrm{sign}(y-m))^p}$$

4. SGED
$$SGED(y; m, \lambda, \phi, p) = \frac{1-\lambda}{2} + \frac{1 + \lambda\,\mathrm{sign}(y-m)}{2}\,\mathrm{sign}(y-m)\,\Gamma_z(1/p)$$
where
$$z = \frac{|y-m|^p}{\phi^p\,(1 + \lambda\,\mathrm{sign}(y-m))^p}$$

5.
IHS
$$IHS(y; \mu, \sigma, k, \lambda) = \Pr(Y \le y) = \Pr(Z \le z)$$
where $Z \sim N(z; 0, \sigma^2 = 1)$,
$$\Pr(Z \le z) = \frac{1}{2} + \frac{\mathrm{sign}(z)}{2}\,\Gamma_{z^2/2}(1/2) = \frac{1}{2} + \frac{z}{\sqrt{2\pi}}\,{}_1F_1\!\left[\tfrac{1}{2};\,\tfrac{3}{2};\,-\tfrac{z^2}{2}\right]$$
and
$$z = k\,\ln\!\left[\frac{y-a}{b} + \sqrt{1 + \left(\frac{y-a}{b}\right)^2}\,\right] - \lambda k$$
with $b = \sigma/\sigma_w$, $a = \mu - b\,\mu_w$,
$$\mu_w = .5\,(e^{\lambda} - e^{-\lambda})\,e^{.5k^{-2}}, \qquad \sigma_w = .5\,\left(e^{2\lambda + k^{-2}} + e^{-2\lambda + k^{-2}} + 2\right)^{.5}\left(e^{k^{-2}} - 1\right)^{.5}$$

6. g-and-h distribution
Numeric procedures, based on the use of order statistics as outlined in Exploring Data Tables, Trends, and Shapes by Hoaglin, Mosteller, and Tukey (1985), Wiley.
For h > 0, the transformation
$$Y_{g,h}(Z) = a + b\,\frac{e^{gZ} - 1}{g}\,e^{hZ^2/2}$$
is one-to-one (Martinez, J. and B. Iglewicz, 1984. “Some Properties of the Tukey g and h Family of Distributions,” Communications in Statistics: Theory and Methods 13, 353-369).
Even without an explicit functional form for the inverse, numerical “MLE” estimates can be obtained.

Appendices
  Cumulative distribution functions
  Option pricing basics
    1. European call option
    2. Put option
    3. Definitions of terms
    4. Assumptions
    5. Volatility
    6. The Greeks
  VaR: Models and applications discussion

Appendices: Option pricing basics
1. European call option
$$C(f, S_T, T, X, r) = e^{-rT}\,E[C(S_0, 0, X, r)] = e^{-rT}\int_X^{\infty} (S - X)\,f(S \mid S_T, T)\,dS$$
$$= S_T\,\bar{\phi}(X/S_T;\,1) - e^{-rT}X\,\bar{\phi}(X/S_T;\,0)$$
BS: $S_T\,\Phi(d_1) - e^{-rT}X\,\Phi(d_2)$

2. Put option
BS put formula: $e^{-rT}X\,\Phi(-d_2) - S_T\,\Phi(-d_1)$

Appendices: Option pricing basics
3. Definitions of terms:
  $T$ = time to expiration
  $S_T$ = current market price
  $r$ = interest rate (risk-free rate)
  $X$ = strike price (or exercise price)
    call options: price at which the instrument can be purchased up to expiration
      $S_T - X$: profit per share gained upon exercising or selling the option
      $S_T - X > 0$: in the money
      $S_T - X < 0$: out of the money
    put options: price at which the instrument can be sold up to expiration

Appendices: Option pricing basics
4. Assumptions:
  Can short sell the underlying instrument
  No arbitrage opportunities
  Continuous trading in the instrument
  No taxes or transaction costs
  Securities are perfectly divisible
  Can borrow or lend at a constant risk-free rate
  The instrument does not pay a dividend
5. Volatility (in the BS option pricing formula, based on the LN)

Appendices: Option pricing basics
6. The Greeks:
  $\Delta$ (delta) measures the change in the value of the instrument in response to a change in the current market price
  $$\Delta = \frac{\partial C(f, S_T, T, X, r)}{\partial S_T} = \bar{\phi}(X/S_T;\,1)$$
  $\kappa$ (kappa or vega) measures the responsiveness of the value of the instrument to a change in volatility
  $$\kappa = \frac{\partial C(f, S_T, T, X, r)}{\partial \sigma} \quad (\sigma = \text{volatility})$$
  $\Theta$ (theta) measures the responsiveness of the value of the instrument to $T$ (time to expiration)
  $$\Theta = \frac{\partial C(f, S_T, T, X, r)}{\partial T}$$
  $\rho$ (rho) measures the responsiveness to changes in the risk-free rate
  $$\rho = \frac{\partial C(f, S_T, T, X, r)}{\partial r}$$

Appendices, VaR: Models and applications discussion
  Paulson and Faris (1985) used the stable family and Aiuppa (1988) used the Pearson family to model insurance losses.
  Ramlau-Hansen (1988) modeled fire, windstorm, and glass claims using the log-gamma and lognormal.
  Cummins et al. (1990) modeled fire losses using the GB2.
  Cummins, Lewis, and Phillips (1999) used the LN, Burr 12, and GB2 to model hurricane and earthquake losses.

Hogg, R. V. and S. A. Klugman, 1983. “On the Estimation of Long Tailed Skewed Distributions with Actuarial Applications.” Journal of Econometrics 23, 91-102.
  Models loss distributions (a. hurricanes (1949-1980), b. malpractice claims paid for insured hospitals in 1975)
  Considers the exponential, Pareto (mixture of an exponential and inverse gamma), generalized Pareto (mixture of a gamma and inverse gamma), Burr (mixture of a Weibull and inverse gamma), log-t (mixture of a lognormal and inverse gamma), and log-gamma distributions.
  Considers alternative estimation procedures: maximum likelihood and minimum distance estimators
  Many loss distributions are characterized by skewness and long tails, such as those associated with the flexible distributions arising from mixtures.

Appendices, VaR: Models and applications discussion
Cummins, J. D., G. Dionne, J. B. McDonald, and B. M. Pritchett, 1990. “Applications of the GB2 family of distributions in modeling insurance loss processes.” Insurance: Mathematics and Economics 9, 257-272.
  Models fire losses
  Considers the GB2 and its special cases GG, BR3, BR12, LN, W, and GA to model the fire loss data
  MLE estimates of the distributional parameters and the Maximum Probable Yearly Aggregate Loss (MPY) were obtained at the .01 level
  Important to use distributions that permit thick tails
Bali, T. G., 2003. “An Extreme Value Approach to Estimating Volatility and Value at Risk,” Journal of Business, 76: 83-108.

Appendices, VaR: Models and applications discussion
Cummins, J. D., C. Merrill, and J. B. McDonald, 2007. “Risky Loss Distributions and Modeling the Loss Reserve Pay-out Tail,” Review of Applied Economics 3.
  Estimates the aggregate loss distribution associated with claims incurred in a given year but settled in different years
  Data: U.S. products liability insurance paid claims (Insurance Services Office (ISO))
  Mixture model:
    Considers different GB2 distributions for each cell (year)
    Multinomial distribution for the fraction of claims settled at different lags
  Single aggregate GB2 distribution for each year
  GB2 provides a significantly better fit to severity data than the LN, gamma, Weibull, Burr12, or generalized gamma
  The aggregate GB2 distribution has a thicker tail than does the mixture distribution

Appendices, VaR: Models and applications discussion
Bali, T. G. and P. Theodossiou, 2008. “Risk Measurement Performance of Alternative Distribution Functions,” Journal of Risk and Insurance, 75: 411-437.
  Models: Unconditional formulations
    Generalized Pareto
    Generalized extreme value
    Box-Cox extreme value
    SGED
    SGT
    EGB2
    IHS
  Models: Conditional formulations (model time-varying VaR thresholds)
  $$R_t = \phi_0 + \phi_1 R_{t-1} + z_t\sigma_t = \mu_t + z_t\sigma_t$$
  $$\sigma_t = \beta_0 + \beta_1 |z_{t-1}|\,\sigma_{t-1} + \beta_2\,\sigma_{t-1}$$

Appendices, VaR: Models and applications discussion
Bali, T. G. and P. Theodossiou, 2008. “Risk Measurement Performance of Alternative Distribution Functions,” Journal of Risk and Insurance, 75: 411-437. (continued)
  Data
    S&P500 composite index (1/4/1950 to 12/29/2000)
    Daily percentage log-returns: n = 12,832, maximum = 8.71, minimum = -22.90, skewness = 1.622, kurtosis = 45.52
  Findings
    Out-of-sample VaR estimates are rejected for most unconditional specifications
    Thresholds exhibit time-varying behavior
    Out-of-sample VaR estimates for the conditional specifications corresponding to the SGT, IHS, and EGB2 perform better than those for the extreme value distributions

END OF APPENDICES
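
Illustration: conditional VaR thresholds
The conditional (AR(1)-ABS-GARCH(1,1)) formulation discussed above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the procedure used by Bali and Theodossiou: the standardized return z is taken to be standard normal as a stand-in for the SGT, IHS, or EGB2, the coefficient names phi0, phi1, beta0, beta1, beta2 and both function names are hypothetical, and the recursion is seeded with an assumed initial scale and an initial standardized return of zero. It produces one threshold per day, mu_t + sigma_t * F_Z^{-1}(alpha), and the share of days on which the realized return fell below it.

```python
from statistics import NormalDist


def conditional_var_thresholds(returns, phi0, phi1, beta0, beta1, beta2,
                               alpha=0.01, sigma_init=1.0, z_init=0.0):
    """Return one-step VaR thresholds mu_t + sigma_t * F_Z^{-1}(alpha).

    returns[0] only seeds the AR(1) lag; one threshold is produced for each
    element of returns[1:], using information available at t-1.
    """
    # Assumption: z_t is standard normal. A flexible distribution (SGT, IHS,
    # EGB2) would supply a different quantile here.
    z_alpha = NormalDist().inv_cdf(alpha)

    thresholds = []
    sigma_prev, z_prev = sigma_init, z_init
    for t in range(1, len(returns)):
        # AR(1) conditional mean
        mu_t = phi0 + phi1 * returns[t - 1]
        # ABS-GARCH(1,1) conditional scale
        sigma_t = beta0 + beta1 * abs(z_prev) * sigma_prev + beta2 * sigma_prev
        thresholds.append(mu_t + sigma_t * z_alpha)
        # Standardized return realized at t, used in the next step
        z_prev = (returns[t] - mu_t) / sigma_t
        sigma_prev = sigma_t
    return thresholds


def violation_rate(returns, thresholds):
    """Share of days on which the realized return fell below its threshold."""
    realized = returns[1:]
    hits = sum(r < v for r, v in zip(realized, thresholds))
    return hits / len(realized)


# Hypothetical inputs, chosen only to exercise the recursion.
sample_returns = [0.0, 1.0, -2.0, 0.5, -3.5]
var_01 = conditional_var_thresholds(sample_returns, phi0=0.1, phi1=0.5,
                                    beta0=0.2, beta1=0.1, beta2=0.8)
print(var_01)
print(violation_rate(sample_returns, var_01))
```

If the model is well specified, the violation rate in a long out-of-sample period should be close to alpha; in practice the coefficients and distributional parameters would be estimated jointly by maximum likelihood rather than supplied by hand.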