Discrete Survival Models with Macroeconomic Variables for Retail Stress Tests

Dr Tony Bellotti
Department of Mathematics, Imperial College London
a.bellotti@imperial.ac.uk

Joint work with Prof Jonathan Crook
Credit Research Centre, University of Edinburgh Business School

RSS Credit Risk workshop, 13 June 2013

Outline of this presentation
1. Motivation.
2. Model structure for including macroeconomic data.
3. Forecasting using the model.
4. Scenario stress testing of a portfolio using the model.
5. Stress tests based on Monte Carlo simulation.
6. Conclusion.

Motivation – Models for stress testing
• Stress testing requires a projection of income for a financial institution based on hypothetical adverse economic conditions.
• The adverse conditions may be provided as economic scenarios by regulatory bodies or may be generated internally.
• The projections of income are based partly on aggregation of projected losses on loan portfolios.
• Therefore there is a need for statistical/econometric models for different loan types which can be used to accurately project possible losses conditional on different economic scenarios.

Motivation – Survival models
Why survival models?
• Natural to model time to default;
• Inclusion of dynamic data: behavioural and economic;
• Calculation of expected profit over the lifetime of loans.
Why include macroeconomic variables?
• Build models that are sensitive to changes in the economy.
• Forecast default rates and expected losses more accurately.
• Stress test using statistical models allowing the inclusion of economic scenarios.

Discrete survival model of credit default
• The discrete survival model covers most options required of a dynamic credit risk model, whilst being computationally efficient.
• The following types of variables can be included:
  - Static variables about the obligor (ie application or bureau data).
  - Duration (age of account) effect.
  - Vintage effects.
  - Behavioural variables (BVs) about credit usage and repayments.
  - Macroeconomic variables (MVs).
  - Frailty term (ie unknown account-level effect).
  - Unknown systemic (calendar time) effect.

Model structure

\[
P\left(d_{it} = 1 \mid d_{is} = 0 \text{ for } s < t,\ \mathbf{w}_i,\ \mathbf{x}_{i,t-k},\ \mathbf{z}_{a_i+t-l}\right)
= F\left(\beta_0 + \boldsymbol{\beta}_{\varphi}^{T} \boldsymbol{\varphi}(t) + \boldsymbol{\beta}_{\mathbf{w}}^{T} \mathbf{w}_i + \boldsymbol{\beta}_{\mathbf{x}}^{T} \mathbf{x}_{i,t-k} + \varepsilon_i + \boldsymbol{\beta}_{\mathbf{z}}^{T} \mathbf{z}_{a_i+t-l} + \gamma_{a_i+t}\right)
\]

where
  $d_{it}$                  Outcome on account $i$ after some duration $t$: 1 = default, 0 = non-default. Typically, duration $t$ is age of the account.
  $\boldsymbol{\varphi}(t)$ Non-linear transformation of duration; baseline hazard. eg, $\boldsymbol{\varphi}(t) = \left(t,\ t^2,\ \log t,\ (\log t)^2\right)$.
  $\mathbf{w}_i$            Static variables; eg application variables and vintage effect.
  $\mathbf{x}_{i,t-k}$      Behavioural variables over time (with some lag $k$).
  $a_i$                     Date of origination of account $i$.
  $\varepsilon_i$           Frailty term on account $i$.
  $\mathbf{z}_{a_i+t-l}$    Macroeconomic variables over calendar time $a_i + t$ (with some lag $l$).
  $\gamma_{a_i+t}$          Unknown systemic (calendar time) effect.

Model estimation
• This is a panel model structure over accounts $i$ and duration $t$.
• Need to specify a link function $F$. This could be logit or probit.
• Taking $F$ to be complementary log-log, ie $F(\eta) = 1 - \exp(-\exp(\eta))$, yields a discrete version of the Cox proportional hazard model.
• Most of the variables are included as fixed effect terms.
• Frailty $\varepsilon_i$ and systemic effect $\gamma_{a_i+t}$ are included as random effect terms, assumed to be normally distributed.
• Maximum marginal likelihood can be used to estimate coefficients on fixed effects $(\beta_0, \boldsymbol{\beta}_{\varphi}, \boldsymbol{\beta}_{\mathbf{w}}, \boldsymbol{\beta}_{\mathbf{x}}, \boldsymbol{\beta}_{\mathbf{z}})$ and variance of the random effects.

Using the model: Forecasts
• For forecasts, we may be interested in computing the probability of default (PD) within some time $u$ of opening an account:

\[
\mathrm{PD}(i, u) = 1 - \prod_{t=1}^{u} \left[ 1 - P\left(d_{it} = 1 \mid d_{is} = 0 \text{ for } s < t,\ \mathbf{w}_i,\ \mathbf{x}_{i,t-k},\ \mathbf{z}_{a_i+t-l}\right) \right]
\]

• At aggregate (portfolio) level we are interested in the expected value of the default rate (DR) at some calendar time period $c$:

\[
E(\mathrm{DR}) = \frac{1}{N} \sum_{i=1}^{N} P\left(d_{i,c-a_i} = 1 \mid d_{is} = 0 \text{ for } s < c - a_i,\ \mathbf{w}_i,\ \mathbf{x}_{i,c-a_i-k},\ \mathbf{z}_{c-l}\right)
\]

where $N$ is the number of accounts such that $a_i < c$.
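The two forecast quantities above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes the per-period conditional default probabilities have already been obtained from a fitted model, and the function names and the toy linear-predictor values are hypothetical.

```python
import math


def logit_link(eta):
    """Logit link: maps a linear predictor to a per-period default probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def cloglog_link(eta):
    """Complementary log-log link, F(eta) = 1 - exp(-exp(eta))."""
    return 1.0 - math.exp(-math.exp(eta))


def cumulative_pd(hazards):
    """PD within u periods of opening: 1 - product over t of (1 - hazard_t)."""
    survival = 1.0
    for h in hazards:
        survival *= 1.0 - h
    return 1.0 - survival


def expected_default_rate(hazards_at_c):
    """Expected DR in one calendar period: mean hazard over the open accounts."""
    return sum(hazards_at_c) / len(hazards_at_c)


# Hypothetical linear predictors for one account over its first three months.
etas = [-5.0, -4.8, -4.6]
pd_3m = cumulative_pd([logit_link(e) for e in etas])

# Hypothetical linear predictors for three accounts in the same calendar month.
dr = expected_default_rate([logit_link(e) for e in (-5.5, -5.0, -4.0)])
```

Swapping `logit_link` for `cloglog_link` changes only the link function; the product and averaging steps are unchanged.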
Using the model: Forecasting and stress testing
• Notice that for DR estimates, even if coefficient estimates on macroeconomic variables are small, they may have a large overall effect because they affect all accounts in the same way at the same (calendar) time.
• Because the PD and DR are estimated as functions of macroeconomic variables, the essence of stress testing is to use values of these variables corresponding to recession scenarios to adjust the forecasts.
• Stress testing using scenarios then gives a point estimate adjustment.
• Generating economic simulations from a distribution of plausible future economic conditions yields a distribution of estimates for PD and DR.

Illustration of scenario stress test
• Suppose we want to stress test PD on a UK credit card portfolio.
• Include several application variables (AVs) and BVs.
• Include two MVs: UK interest rates (IR) and unemployment rate (UR) in millions, as differences over 12 months.
• Build a discrete survival model with logit link function.
• Estimate coefficients on IR and UR as 0.11 and 0.67 respectively (note, these figures are based on real model results).
• Suppose a log-odds score for a customer during “normal” conditions (IR=0 and UR=0) is -5.45.
• Then suppose a stress scenario #1: IR=+1.5 and UR=+0.25.
• And worse scenario #2: IR=+3 and UR=+1.
• How do the stressed scenarios affect the estimate of PD over a yearly period?

Example of scenario stress test
Simplified version of calculation of stressed PD for illustration….

                                    Normal condition   Scenario #1   Scenario #2
Base score (s1)                     -5.45              -5.45         -5.45
UK interest rate (difference)       0                  +1.5          +3
… contribution to score (s2)        0                  +0.165        +0.33
Unemployment rate (difference)      0                  +0.25         +1
… contribution to score (s3)        0                  +0.168        +0.67
Total log-odds score (s1+s2+s3)     -5.45              -5.12         -4.45
PD in one month                     0.0043             0.0060        0.012
PD over one year                    0.05               0.069         0.13

Therefore, stress scenario #1 represents an increase in the one-year PD of almost 38% relative to normal conditions (0.069 against 0.05).
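The arithmetic behind the scenario table can be reproduced with a short Python sketch. It uses only the figures quoted above (base score -5.45, coefficients 0.11 and 0.67) and assumes, as a simplification, that the same monthly PD applies in each of the 12 months; the function and variable names are illustrative.

```python
import math

# Coefficients, base score and scenarios as quoted in the illustration.
COEF_IR, COEF_UR = 0.11, 0.67
BASE_SCORE = -5.45
SCENARIOS = {
    "normal": (0.0, 0.0),
    "scenario_1": (1.5, 0.25),
    "scenario_2": (3.0, 1.0),
}


def stressed_pd(ir_diff, ur_diff, months=12):
    """Return (log-odds score, one-month PD, PD over `months` months)."""
    score = BASE_SCORE + COEF_IR * ir_diff + COEF_UR * ur_diff
    pd_month = 1.0 / (1.0 + math.exp(-score))  # logit link
    # Simplifying assumption: constant monthly PD across the horizon.
    pd_horizon = 1.0 - (1.0 - pd_month) ** months
    return score, pd_month, pd_horizon


results = {name: stressed_pd(ir, ur) for name, (ir, ur) in SCENARIOS.items()}
for name, (score, pd_month, pd_year) in results.items():
    print(f"{name:10s} score={score:6.2f} PD(1m)={pd_month:.4f} PD(1y)={pd_year:.3f}")

# Relative increase in one-year PD under scenario #1 versus normal conditions.
uplift = results["scenario_1"][2] / results["normal"][2] - 1.0
print(f"Scenario #1 uplift in one-year PD: {uplift:.0%}")
```

Because the macroeconomic terms enter the log-odds score additively, a new scenario only needs a new pair of IR and UR differences; the rest of the calculation is unchanged.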
Some issues for including variables in the model
• Behavioural variables (BVs) should have long lags (eg a year) to enable forecasting ahead.
• Macroeconomic variables (MVs) do not necessarily require long lags since macroeconomic models can be used to forecast ahead (eg GDP forecast models).
• BVs and MVs: Possibly include as aggregates (eg mean, max, min or geometric lag).
• Trends in MVs: Use differences; eg growth in GDP instead of GDP.
• Correlations amongst MVs….

Correlation amongst macroeconomic variables
• We expect MVs to be highly correlated.
• This could lead to problems with the model estimation (multicollinearity).
• Two tactics to deal with this:
  - Variable selection: Hope to remove highly correlated variables.
    Problem: This method is somewhat crude.
  - Factor analysis: Reduce MVs to orthogonal components.
[Diagram: “raw” MV time series $z_1, z_2, z_3, z_4$ pass through factor analysis (eg PCA) to give macroeconomic “factors”: component 1 $z^*_1$ and component 2 $z^*_2$.]
• Example: Chicago Fed economic index (CFNAI).

Model over the business cycle / population drift
• Train over a whole business cycle, if possible, to get good estimates for MVs.
  - How much of the business cycle is needed? 5 years?
  - Explore breakpoints in the economy and effect on the model.
• Dealing with old data and population drift in the credit risk model.
  - That is, the MV estimate may require several years of data, whereas a useful default model may use just one year.
  - Possible solution: Two-part model…
    1. Build long-term model with MV estimates;
    2. Build default model on short amount of recent data;
    3. Merge the two.

Result 1: Including MVs
We now present some results using the discrete survival model for forecasting DR, stress testing and asset correlation estimates.
• UK credit card data covering a period from 1996 to mid-2006:
  - Training data 1996 to 2004 (over 400,000 accounts);
  - Out-of-sample test data 2005 to mid-2006 (over 150,000 accounts).
• Includes application variables (AVs), behavioural variables (BVs) and several macroeconomic variables (MVs).
• Two MVs were found to be statistically significant at the 5% level:
  - Bank interest rate (aggregate): positive coefficient.
  - Unemployment rate: positive coefficient.
These results will be published soon in the International Journal of Forecasting.

Result 2: Forecasting DR (reality check)
• Three alternative models are built with different variables to test forecasts of default rates (DR).
• The mean absolute difference (MAD) between estimated and observed DR on the test data set is used as a performance measure.

Model                               MAD
AV only                             0.087
AV and BV lag 12 months             0.058
AV, BV lag 12 & MV lag 3 months     0.049

• The model with MVs gives the best result, with the lowest MAD.
• This result is consistent across the test set over time.

Result 3: Stress testing using simulation
• Monte Carlo simulation based on drawing economic scenarios from a plausible distribution of MVs.
[Figure: simulated distribution of the estimated default rate (as ratio of median value), marking the median, the observed DR, VaR (99% level) and the region of the expected shortfall calculation (99% level).]
• VaR (99%) = 1.59 times median.
• Expected shortfall (99%) = 1.73 times median.

Schema for stress test simulation method
[Diagram: Monte Carlo simulation to compute Value at Risk and expected shortfall. Random draws from the distribution over MVs are fed, together with the test data set, into the credit risk model with MVs to compute DR / losses; the resulting loss distribution gives Value at Risk and expected shortfall.]

Stress testing by simulation
• Estimate macroeconomic distribution (MD) using historical values of MVs.
• Draw scenarios from the MD:
  - Normalization and Cholesky decomposition (to deal with collinearity between MVs).
  - Extreme but plausible events.
• Including other elements in the simulation:
  - Random effects at account (estimates) and calendar time levels (draw from random effect distribution).
  - Model uncertainty (ie uncertainty in coefficient estimates).

Conclusion
1.
The discrete survival model with random effects is a rich model that allows us to consider a variety of important risk factors over time.
2. It is computationally efficient.
3. Empirical evidence shows the model gives improved forecasts and plausible stress test results.
4. Requires some careful thought and analysis regarding exactly how time varying variables are included in the model.

Future work and open problems
1. Two-part model to avoid the problem of population drift.
2. Further methodological work required on the statistical problem of stress testing using simulation.
3. Test the method on a variety of different credit data.

Discrete survival models for stress testing
Joint work by Prof Jonathan Crook* and Dr Tony Bellotti**
* Credit Research Centre, University of Edinburgh Business School
** Department of Mathematics, Imperial College London
• Email: a.bellotti@imperial.ac.uk
• Web page: www2.imperial.ac.uk/~abellott/