Bridging the Academic–Practitioner Divide in Credit Risk Modeling
Vadim Melnitchouk, Metropolitan State University, Saint Paul, MN, US

Agenda
1. Academic model selection by a practitioner and organizational issues
2. ‘Optimal complexity model’: stochastic parametric method with macroeconomic variables and unobserved consumer heterogeneity
3. Data access, collaboration & prototype development

Who is a practitioner?
1. Ph.D. in applied math, former academic, teaching ‘Data Mining’ part-time
2. Ph.D. in physics, former academic
3. M.S. in OR, former ‘Fed’ examiner
4. M.S. in Econometrics

A practitioner’s search for the right academic paper/model
Paper/Methodology | Potential Business Impact | Organizational issue
Andreeva, Ansell & Crook 'Modeling Profitability using Survival Combination Scores' | Increase profitability | How to get CRO & CMO to agree on the same KPIs?
Bellotti & Crook 'Forecasting and Stress Testing Credit Card Default..' | More accurate estimation for unexpected losses, Economic Capital reduction | US banks are getting a stress test scenario from regulators
Fader & Hardie 'Customer-Base Analysis with Discrete-time …' | Increase sales, prevent customer attrition | Was implemented at GE Money in 2008-2009
Fader & Hardie 'Customer-Base Analysis with Discrete-time …' | Reduce losses | Cultural resistance
Leow & Crook 'Intensity Models and Transition Probabilities' | Reduce losses | Feasible, but optimal complexity model is required

Time to Default: Optimal complexity model
1. According to Bellotti & Crook (2007), survival (hazard) modeling is a competitive alternative to logistic regression for predicting default events.
2. The method has become a model of choice in recent publications, but its complexity makes it infeasible for practitioners.
3. It also has some limitations.
Bellotti (2010) argues that ‘any credit risk model with macroeconomic variables can’t be expected to capture the direct reason for default like a loss of job, negative equity or a sudden personal crisis such as sickness or divorce’.

Methodology
The goal of this paper is to present a more practical method that can also take unobserved obligor heterogeneity into account.
The stochastic parametric Time to Event method is well known in marketing (Hardie & Fader, 2001). It was also applied by Brusilovskiy (2005) to predict the time of the first home purchase by immigrants.
As far as we know, the method has not been used in credit risk by academics or practitioners.

Assumptions & inputs
Assumptions:
1. Time to Default: Weibull distribution (Appendix).
2. Default density across obligors: Gamma distribution (to include unobserved consumer heterogeneity).
3. Vintage aggregate-level modeling, to avoid the so-called aggregation bias when unemployment is used.
Inputs:
1. Monthly number of defaults.
2. Time-varying covariates: unemployment and Home Price Index (HPI). Macroeconomic factors are incorporated into the hazard rate function.

Recent trends in mortgage default rate & data
1. Default rates spiked above historical trends for the 2005 vintage, and more significantly for the 2006 and 2007 vintages, beginning almost immediately after origination.
2. The average time to reach the maximum default rate decreased from 5-6 years (vintages 2001-2004) to 2-3 years (vintages 2005-2007).
3. LPS data on prime, first-lien, 30-year fixed-rate mortgages originated in 2006 were used to build the model (Schelkle, 2011).

Model training and out-of-time validation
1. The model training period for vintage 2006 was June 2006 – March 2009.
2. April 2009 to March 2010 was selected for ‘out-of-time’ validation because unemployment increased from 8.5% to 10.1% during this period.
3. The model was implemented in MS Excel (using Solver) and in SAS/IML. The five model parameters were estimated by maximum likelihood.
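A minimal sketch of the Weibull/Gamma likelihood described above, written in Python for illustration (the original implementation used MS Excel Solver and SAS/IML). The covariate form B(t) is an assumption borrowed from the marketing literature on Weibull-gamma timing models with time-varying covariates, since the Appendix formula is not reproduced here. The function names, the vintage_size argument and all numbers are hypothetical.

```python
import math

def cumulative_default_prob(covariates, r, alpha, c, betas):
    """Return F(1..T) for a Weibull-gamma model with time-varying covariates.

    Assumed form: F(t) = 1 - (alpha / (alpha + B(t)))**r, where
    B(t) = sum over i <= t of (i**c - (i-1)**c) * exp(betas . x_i).
    Requires r > 0, alpha > 0, c > 0.
    """
    cdf, b = [], 0.0
    for i, x in enumerate(covariates, start=1):
        effect = math.exp(sum(bk * xk for bk, xk in zip(betas, x)))
        b += (i ** c - (i - 1) ** c) * effect
        cdf.append(1.0 - (alpha / (alpha + b)) ** r)
    return cdf

def log_likelihood(params, monthly_defaults, vintage_size, covariates):
    """Grouped-data log-likelihood for one vintage (five parameters)."""
    r, alpha, c, b_unemp, b_hpi = params
    cdf = cumulative_default_prob(covariates, r, alpha, c, (b_unemp, b_hpi))
    ll, prev = 0.0, 0.0
    for n_t, f_t in zip(monthly_defaults, cdf):
        ll += n_t * math.log(f_t - prev)   # loans defaulting in month t
        prev = f_t
    survivors = vintage_size - sum(monthly_defaults)
    ll += survivors * math.log(1.0 - cdf[-1])  # loans not yet defaulted
    return ll

def expected_defaults(vintage_size, cdf):
    """Expected number of defaults in each month."""
    return [vintage_size * (f - p) for f, p in zip(cdf, [0.0] + cdf[:-1])]

# Hypothetical inputs: three months of (unemployment rate, HPI change).
covs = [(0.085, -0.010), (0.090, -0.015), (0.095, -0.020)]
params = (0.5, 200.0, 1.5, 10.0, -5.0)  # r, alpha, c, b_unemp, b_hpi (made up)
print(log_likelihood(params, [120, 150, 170], 100_000, covs))
```

Any general-purpose optimizer could then maximize log_likelihood over the five parameters, subject to r, alpha and c being positive.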
[Figure: Forecasted vs. actual monthly number of defaults, Weibull/Gamma model for the 2006 mortgage origination year (LPS data, vintage 2006). Series: Pred, Act.]

Results & Discussion
Forecast accuracy for the ‘out-of-time’ period is at an acceptable level (low forecast error and a conservative estimate for regulators).
Issues with the one-segment model:
1. The time-varying covariates formula is taken from a marketing application and is not flexible enough for credit risk modeling (Appendix).
2. The impact of unemployment and HPI can be double counted.

Next steps in collaboration with academics
1. Bayesian parameter estimation was applied in collaboration with Prof. Shemyakin (St. Thomas University, St. Paul, MN) and his students to improve numerical stability.
2. A two-segment latent class Weibull model (Appendix) was also used to estimate the parameters of the consumer segment whose default hazard increases over time.
3. Unemployment and HPI were not included, to avoid double counting (academic’s preference).

Data access and three levels of collaboration
Collaboration level | Execution by | Academic's Motivation | Practitioner's Motivation | Data Access | Academic Partner
Looking over your shoulder | Practitioner | Marketing and validation | Apply new method (professional growth) | N/A | Prof. Fader & Prof. Hardie
Joint supervision | Student | Real life project for a student | Additional validation & enhancement | Vintage aggregated data only | Prof. Shemyakin, June 2012
Bridging the Academic–Practitioner Divide | Academic and practitioner | ? | Resolve real issue like wrong signs in multinomial regression coefficients | Aggregated by delinquency status | ?

Data access
1. It is very difficult to get loan-level data from financial firms for joint projects.
2. Aggregate-level delinquency and default data for mortgages, credit cards, installment loans and commercial lending can be extracted from public websites.
3.
However, completely aggregated data, such as the Federal Reserve series (Appendix), must first be decomposed before vintage-based modeling can be applied.

From a prototype to production: possible collaboration
Model description | Model Category | Scope | Major Issue | Possible solution
Non-stationary Markov Chain model with hazard functions and macroeconomic variables | Production | Consumer & Commercial | Zero values for some transition coefficients | Bayesian estimator / Gibbs sampling?
Non-stationary Markov Chain model with multinomial transition functions and macroeconomic variables | Production | Consumer & Commercial | Wrong signs in some transition coefficients |
Experiment with a second order Markov Chain | Research | Commercial | Too many parameters, small sample size for some transitions | MCMC
Forecasting Time to Delinquency using Stochastic Parametric Model | Benchmarking | Consumer | MLE estimation numerical stability | Bayesian estimation
Predicting delinquent loans’ recovery using Stochastic 'Choice' Model | Benchmarking | Consumer | Not included in SAS, R, etc., no standard tests | Alternative to Markov model?

Next search for optimal complexity model: Combined Markov Chain and Survival Analysis
Model description | Macroeconomic variables | Objective function | Major Issue | Possible solution
Next step; Partial MLE ?; N/A; Yes; Partial MLE; Correlated event times; Clustering; Least Sq.;
Migration underestimation; Bayesian MCI (Christodoulakis); MPLE; Zero values in some transition coefficients; Gibbs sampling; MLE; Statistical significance for some transitions; Bayesian estimator
Leow & Crook 'Intensity Models and Transition Probabilities'; Louis, Laere, Baesens 'Predicting bank rating transitions..'; Jones 'Estimating Markov Transition'; Kunovac 'Estimating Credit Migration … – Bayesian Approach'; Grimshaw & Alexander 'Markov Chain model for delinquency..'
Yes; No; No

Conclusions
1. The stochastic parametric method with macroeconomic variables and unobserved consumer heterogeneity can be used by practitioners as an alternative to survival modeling.
2. The optimal complexity model can provide an incentive to try to bridge the Academic–Practitioner Divide.

Appendix
Latent class Weibull model with two segments
Assumptions:
1. All obligors can be divided into two segments, each with its own fixed but unknown values of the shape and scale parameters.
2. The large segment has a decreasing default hazard.
3. A relatively small consumer segment exists with default hazard increasing over time. The segment size (percentage) is a latent variable that must be estimated for each vintage.
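The two-segment assumptions above can be sketched as a finite mixture of two Weibull distributions. This is an illustration under stated assumptions, not the estimation code used in the project: the standard Weibull parameterization S(t) = exp(-(t/scale)^shape) is assumed, and the function names and all parameter values are hypothetical. A shape below 1 gives a decreasing hazard (large segment), and a shape above 1 gives an increasing hazard (small segment).

```python
import math

def weibull_survival(t, shape, scale):
    """S(t) = exp(-(t/scale)**shape)."""
    return math.exp(-((t / scale) ** shape))

def weibull_hazard(t, shape, scale):
    """h(t) = (shape/scale) * (t/scale)**(shape-1), for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)

def mixture_survival(t, w_small, small, large):
    """w_small: latent share of the small segment; small/large: (shape, scale)."""
    return (w_small * weibull_survival(t, *small)
            + (1.0 - w_small) * weibull_survival(t, *large))

def mixture_hazard(t, w_small, small, large):
    """Population hazard = mixture density / mixture survival."""
    density = (
        w_small * weibull_hazard(t, *small) * weibull_survival(t, *small)
        + (1.0 - w_small) * weibull_hazard(t, *large) * weibull_survival(t, *large)
    )
    return density / mixture_survival(t, w_small, small, large)

# Hypothetical values: 10% of the vintage in the increasing-hazard segment.
small_seg = (2.0, 60.0)    # shape > 1: hazard increases with loan age
large_seg = (0.7, 500.0)   # shape < 1: hazard decreases with loan age
for month in (6, 12, 24, 36):
    print(month, mixture_hazard(month, 0.10, small_seg, large_seg))
```

In estimation, w_small and the two (shape, scale) pairs would be fitted separately for each vintage, consistent with the segment size being latent.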