New Performance Measures for Credit Risk Models
MPI 2014
William Morokoff, Yuchang Huang, Liming Yang
Quantitative Analytics
June 23, 2014

Permission to reprint or distribute any content from this presentation requires the prior written approval of Standard & Poor’s. Copyright © 2014 by Standard & Poor’s Financial Services LLC. All rights reserved.

Agenda
• Overview of Credit Risk
• Corporate Probability of Default Models
• Performance Measures for PD Models
• Evaluating Performance Measures
• Ideas for a New Performance Measure

Overview of Credit Risk

Credit Risk: Risk that a borrower will not make timely payment of interest or principal. Borrowers have a ‘default option’; credit risk is the risk that the option will be exercised.

Credit Instruments:
• Bonds: corporate, sovereign, municipal, etc.
• Corporate loans and CLOs
• Consumer asset securitizations: credit card, auto loan, student loan, etc.
• Real Estate securitizations: commercial, residential
• Derivatives (embedded counterparty risk)
• Insurance-linked bonds (e.g. catastrophe bonds)

Overview of Credit Risk

Risk Management
• Relative default risk categories (ratings)
  o Risk tends to grow exponentially by category
• Probability of Default
  o Real world probability of exercising default option
  o $PD(T \mid t_0, x_0)$
• Loss Given Default
• Exposure at Default
• Portfolio Loss

Pricing
• Credit spreads
  o Premium beyond risk free rate
  o Option Adjusted Spread
• Probability of Default
  o Risk neutral measure
• Loss Given Default
• Implied Correlation
  o For derivatives on portfolios

Corporate PD Models

• Structural (Merton/KMV) Models
  o For publicly traded companies, equity is viewed as a call option on the asset value A of a firm with strike = face value of debt (default point D)
  o Distance to Default:
    $$DD = \frac{\log(A/D)}{\sigma_A \sqrt{T}}, \qquad PD = f(DD)$$
• Reduced Form/Default Intensity Models
    $$PD(t) = 1 - \exp\left(-\int_0^t \lambda_s \, ds\right)$$
• Regression Models (logit/probit/etc.)
    $$PD = \frac{1}{1 + \exp\left(-(\beta_0 + \beta^T x)\right)}$$

PD Modeling Process

Collect Firm and Market Data:
• Company specific financial ratios, debt levels, liquidity measures, …
• Macro-economic and market data
• Equity price, volatility, rank, etc.
• Need many years and many firms

Collect Default Data:
• Tag each firm observation as survivor or defaulter over period T

Construct and Scale Factors

Calibrate Model:
• Factor selection (e.g. Greedy Forward)
• Parameter calibration (MLE)
• Out of sample evaluation

Measure Performance
• How well does the model differentiate defaulters and survivors?
• How well does the model fit the observed data?

What Makes PD Modeling and Performance Measurement Difficult?

• Data: Defaults are rare events, so default observations are scarce.
• Finance isn’t Physics: No universal law holds through time, so relationships (i.e. weights on factors) may change through the calibration period and may not hold going forward.
• There may not be a true PD (philosophical question)
  o There may not be any (knowable) probability measure associated with the future events that would lead to default. Such events may not be knowable at this time.
  o As a consequence, it is difficult to think about the accuracy of a PD model in the usual sense of $|PD_{True} - PD_{Model}|$.
• Correlation
  o Factors driving defaults are correlated
  o Measures of model performance generally implicitly assume independent observations

Performance Measures for PD Models

Accuracy Ratio (AR)
• Sort firms/assets/obligors from riskiest to safest as predicted by the credit model (x-axis) and plot against the cumulative fraction of all defaulted obligors captured (y-axis).
• Accuracy Ratio = B / (B + A), where B is the area between the model's curve and the diagonal of a random model, and A is the area between the perfect model's curve and the model's curve.
Note: The terms Gini Coefficient and Accuracy Ratio are often used interchangeably in credit modeling literature.

Receiver Operating Characteristic (ROC)
• Plot the distribution of model score S for defaulters and non-defaulters
• Note: For a perfect scoring model, there will be no overlap in the distributions, whereas for a random/uninformative model, there will be 100% overlap
• If the score is a PD, it is generally not possible to perfectly separate defaulters from non-defaulters.
• Suppose a cutoff value C (i.e., scores less than C are potential defaulters and scores higher than C are potential survivors)
• Given C, 4 outcomes are possible:
  o Incorrect decisions: S < C and survive (Type II) or S > C and default (Type I)
  o Correct decisions: S < C and default or S > C and survive

ROC (cont.)
• Here the score is a PD, so higher values indicate higher risk (the reverse of the orientation of the score S above).
• Define True Positive Rate as a function of C:
    $$TPR(C) = \frac{\#\,\text{Defaults with } PD > C}{\text{Total } \#\,\text{Defaults}}$$
• Define False Positive Rate as a function of C:
    $$FPR(C) = \frac{\#\,\text{Non-Defaults with } PD > C}{\text{Total } \#\,\text{Non-Defaults}}$$
• Plot TPR(C) vs FPR(C) as C runs from 1 to 0
• The larger the AUC (area under the ROC curve, the shaded region in the figure), the better the ranking of the model, because the TPR exceeds the FPR over more of the cutoff range.
    $$AUC = \int_0^1 TPR \; d(FPR)$$

Relationship between ROC and Accuracy Ratio
• If the same weight is attributed to Type I vs. Type II errors, it can be shown that AUC and AR communicate the same information:
    $$AR = 2 \cdot AUC - 1$$
• ROC is, however, a more general measure, as different weights may be given to Type I and II errors. Typically, more weight may be given to Type I vs. Type II error:
  o Example 1: Consuming a toxic mushroom (I) vs. throwing away an edible one (II)
  o Example 2: Giving a loan to a defaulting firm (I) vs. losing potential interest income by not extending credit to a non-defaulting firm (II)
• In practice, AUC and AR are usually treated as equivalent.
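
The relationship AR = 2·AUC − 1 can be checked numerically on a small sample. The sketch below is illustrative only: the function names `auc_pairwise` and `accuracy_ratio_cap` and the five-obligor sample are assumptions made for this example, not part of the presentation. AUC is computed by pair counting (a tie between a defaulter and a survivor counts one half), and AR is computed as B / (B + A) from the curve of cumulative defaulters against cumulative obligors, using the trapezoid rule.

```python
from itertools import groupby


def auc_pairwise(pds, defaulted):
    """AUC as the fraction of (defaulter, survivor) pairs ranked correctly.

    A pair is ranked correctly when the defaulter has the higher PD;
    ties count one half.
    """
    d = [p for p, x in zip(pds, defaulted) if x == 1]
    s = [p for p, x in zip(pds, defaulted) if x == 0]
    if not d or not s:
        raise ValueError("need at least one defaulter and one survivor")
    wins = sum(1.0 if pd_d > pd_s else 0.5 if pd_d == pd_s else 0.0
               for pd_d in d for pd_s in s)
    return wins / (len(d) * len(s))


def accuracy_ratio_cap(pds, defaulted):
    """AR = B / (B + A) from the cumulative-defaulters curve."""
    n, m = len(pds), sum(defaulted)
    if m == 0 or m == n:
        raise ValueError("need at least one defaulter and one survivor")
    # Riskiest (highest PD) first; tied PDs are added to the curve together.
    ranked = sorted(zip(pds, defaulted), key=lambda t: -t[0])
    area = 0.0
    x_prev = y_prev = 0.0
    seen = hits = 0
    for _, group in groupby(ranked, key=lambda t: t[0]):
        group = list(group)
        seen += len(group)
        hits += sum(x for _, x in group)
        x, y = seen / n, hits / m
        area += (x - x_prev) * (y + y_prev) / 2.0  # trapezoid rule
        x_prev, y_prev = x, y
    b = area - 0.5                           # model curve vs. random model
    b_plus_a = (1.0 - m / (2.0 * n)) - 0.5   # perfect model vs. random model
    return b / b_plus_a


# Hypothetical sample: five obligors, two of which defaulted.
sample_pds = [0.30, 0.20, 0.10, 0.05, 0.02]
sample_defaulted = [1, 0, 1, 0, 0]

auc = auc_pairwise(sample_pds, sample_defaulted)
ar = accuracy_ratio_cap(sample_pds, sample_defaulted)
print(f"AUC = {auc:.4f}, AR = {ar:.4f}, 2*AUC - 1 = {2 * auc - 1:.4f}")
```

With ties handled this way, the two computations agree exactly for any finite sample, which is why the two measures are usually treated as interchangeable.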
Likelihood and Goodness of Fit
• Log-Likelihood (j runs over the $N_D$ defaulters, k over the $N_S$ survivors):
    $$L = \sum_{j=1}^{N_D} \log PD_{i(j)} + \sum_{k=1}^{N_S} \log\left(1 - PD_{i(k)}\right)$$
• Likelihood Ratio Test: Determine if model A fits the data better than model B.
    $$R = -2\,(L_B - L_A), \qquad R \sim \chi^2$$
• Deviance Test: Does the model fit the data well?
    $$D = -2L, \qquad D \sim \chi^2$$
• Chi-Squared Test: Does the model fit the data well?
    $$\chi^2 = \sum_{j=1}^{N_D} \frac{1 - PD_{i(j)}}{PD_{i(j)}} + \sum_{k=1}^{N_S} \frac{PD_{i(k)}}{1 - PD_{i(k)}}$$

Evaluating Performance Measures

Accuracy Ratio Calculation
• Assume sample of N observations ordered such that $PD_1 \ge PD_2 \ge \dots \ge PD_N$
    $$X_i = 1 \text{ if defaulter (0 otherwise)}, \qquad M = \sum_{i=1}^{N} X_i, \qquad y_i = \frac{1}{M} \sum_{j=1}^{i} X_j$$
    $$AR_N = \frac{\frac{1}{N}\left(\sum_{i=1}^{N} y_i - \frac{1}{2}\right) - \frac{1}{2}}{\left(1 - \frac{M}{2N}\right) - \frac{1}{2}}$$

Accuracy Ratio Calculation
With some arithmetic:
    $$AR_N = \frac{1}{\frac{M}{N}\left(1 - \frac{M}{N}\right)} \left[ \frac{M}{N}\left(1 + \frac{1}{N}\right) - \frac{2}{N^2} \sum_{i=1}^{N} i\, X_i \right]$$
Now take the limit as N goes to infinity!

Limiting AR
• Assume that there exists a distribution function F(PD)
• Data set is considered as an iid sample of PDs from F(PD).
• PDs can be considered random variables.
• For this calculation, we assume that each sampled PD is a true PD, i.e. the probability that the issuer defaults is exactly PD, so $X_i$ is a random variable and $E[X_i] = PD_i$
• In the limit as N goes to infinity,
    $$\frac{M}{N} \to \overline{PD} = E[PD] = \int_0^1 PD \, dF(PD) = 1 - \int_0^1 F(y)\, dy$$

Limiting AR
    $$AR = \frac{\overline{PD} - 2 \int_0^1 \int_0^{F^{-1}(z)} y \, dF(y)\, dz}{\overline{PD}\,(1 - \overline{PD})}$$
With some calculus:
    $$AR = \frac{1 - \int_0^1 F(y)^2 \, dy - \overline{PD}}{\overline{PD}\,(1 - \overline{PD})}$$

Observations on Limiting AR
• If $\{PD_i\}$ and the associated $\{X_i\}$ are considered random variables, then $AR_N$ is a random variable and
    $$E[AR_N] = AR + O(1/N), \qquad \sigma^2(AR_N) = O(1/N)$$
• Full distribution of $AR_N$ can be computed with simulation.

Observations on Limiting AR
Conclusion: Even for a perfect PD model, as a performance measure for the model, Accuracy Ratio is a noisy (sample-size dependent) estimate of a quantity that depends only on the nature of the population, not the quality of the model.

Special Cases
• Point mass at $0 < PD^* < 1$: AR = 0. All sample observations have identical PDs and therefore there is no ability to separate defaulters from non-defaulters.
• Point masses at $PD_0 = 0$ and $PD_1 = 1$: AR = 1. Part of the population is guaranteed to survive and part is guaranteed to default, so perfect separation is possible.

Special Case: K Buckets
There are K distinct buckets, each with a PD and a weight w representing a percentage of the total population, such that
    $$PD_1 < PD_2 < \dots < PD_K \quad \text{and} \quad \sum_{i=1}^{K} w_i = 1$$
    $$F(x) = \sum_{i=1}^{K} w_i \, \mathbf{1}_{[PD_i, 1]}(x), \qquad \overline{PD} = \sum_{i=1}^{K} w_i \, PD_i$$
    $$AR = \frac{\sum_{i=1}^{K} \sum_{j=1}^{K} w_i w_j \max(PD_i, PD_j) - \overline{PD}}{\overline{PD}\,(1 - \overline{PD})}$$

Example: 10 Buckets

Bucket   PD (%)    % Defaulters (cumulative)   % Obligors (cumulative)
RC10     10.24%    50.05%                      10.00%
RC9       5.12%    75.07%                      20.00%
RC8       2.56%    87.59%                      30.00%
RC7       1.28%    93.84%                      40.00%
RC6       0.64%    96.97%                      50.00%
RC5       0.32%    98.53%                      60.00%
RC4       0.16%    99.32%                      70.00%
RC3       0.08%    99.71%                      80.00%
RC2       0.04%    99.90%                      90.00%
RC1       0.02%    100.00%                     100.00%

[Figure: CAP curve, % Defaulters vs. % Obligors]

Even with perfect risk categorization on a PD basis, the accuracy ratio is only 71.66%.

Example: Increasing Default Levels Increases AR

Bucket   PD (%)    % Defaulters (cumulative)   % Obligors (cumulative)
RC10     51.20%    50.05%                      10.00%
RC9      25.60%    75.07%                      20.00%
RC8      12.80%    87.59%                      30.00%
RC7       6.40%    93.84%                      40.00%
RC6       3.20%    96.97%                      50.00%
RC5       1.60%    98.53%                      60.00%
RC4       0.80%    99.32%                      70.00%
RC3       0.40%    99.71%                      80.00%
RC2       0.20%    99.90%                      90.00%
RC1       0.10%    100.00%                     100.00%

[Figure: CAP curve, % Defaulters vs. % Obligors]

Increasing default levels (while maintaining the percentage of defaults captured) increases AR from 71.66% to 78.19%.

Changing AR by Changing Bucket Weights
AR is maximized when only the extreme buckets (most risky and least risky) are loaded (for fixed average PD)
• Maximize:
    $$w_1 = \frac{PD_K - \overline{PD}}{PD_K - PD_1}, \qquad w_K = \frac{\overline{PD} - PD_1}{PD_K - PD_1}, \qquad w_2 = \dots = w_{K-1} = 0$$
• Minimize:
    $$w_i = \frac{PD_{i+1} - \overline{PD}}{PD_{i+1} - PD_i}, \qquad w_{i+1} = \frac{\overline{PD} - PD_i}{PD_{i+1} - PD_i}, \qquad w_j = 0 \text{ otherwise}, \quad \text{where } PD_i \le \overline{PD} \le PD_{i+1}$$
Implication: “Better” ARs may be obtained through increased “sampling” from the two extreme buckets.

Impact of Correlation in Large Pool Limit
• Simple single factor Gaussian Copula correlation model
• Main result: Increasing correlation improves AR for large pools, except for maximum correlation of 100%

Impact of Correlation on Finite Pools
• Simulation method: Estimate mean AR over 100,000 trials
• For finite pools there is a non-linear relationship between AR and default rates, ED(AR) ≠ AR(ED)
• Main observation: Correlation improves mean AR for heterogeneous (in PD) portfolios but is detrimental to homogeneous portfolios

Ideas for a New Performance Measure

Desirable Properties
• A better performance measure for a PD model should focus on the correctness of the PD estimate relative to true PD, and not be skewed by the nature of the sample population.
• The evaluation of a true model (firms default with exactly the modeled PD frequency) would receive a close to perfect score.
• The worst score would require significant mislabeling of (almost) guaranteed defaulters and survivors.
• Ideally, correlation of defaulters in the sample would be taken into account.

A Few Ideas:
• Measure difference between true PD and model
    $$s = 1 - E_x \left| PD_{True}(x) - PD_{Model}(x) \right|$$
    $$s = 1 - \frac{1}{N} \sum_{i=1}^{N} \left| X_i - PD_{Model}(x_i) \right|$$
• Measure difference in PD distributions
    $$s = 1 - \int_0^1 \left| F_{True}(PD) - F_{Model}(PD) \right| \, dPD$$
• Measure likelihood of ‘Portfolio of Defaults’
• Test whether number of defaults is consistent with a correlated portfolio model based on model PDs.
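
As a numerical check on the K-bucket special case and the two ten-bucket examples earlier in this section, the limiting accuracy ratio AR = (Σ_i Σ_j w_i w_j max(PD_i, PD_j) − mean PD) / (mean PD · (1 − mean PD)) can be evaluated directly. The sketch below is a minimal illustration under the slides' assumption that each bucket PD is the true default probability; the function names `limiting_ar_buckets` and `extreme_weights` are hypothetical and chosen for this example only.

```python
def limiting_ar_buckets(pds, weights):
    """Limiting accuracy ratio for a population split into K PD buckets.

    Assumes each bucket PD is the true default probability.
    """
    if len(pds) != len(weights):
        raise ValueError("pds and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    mean_pd = sum(w * p for w, p in zip(weights, pds))
    if not 0.0 < mean_pd < 1.0:
        raise ValueError("average PD must lie strictly between 0 and 1")
    # E[max(PD_i, PD_j)] over two independent draws from the bucket distribution.
    e_max = sum(wi * wj * max(pi, pj)
                for wi, pi in zip(weights, pds)
                for wj, pj in zip(weights, pds))
    return (e_max - mean_pd) / (mean_pd * (1.0 - mean_pd))


def extreme_weights(pds, mean_pd):
    """Weights that load only the lowest and highest PD buckets at a fixed mean PD."""
    lo, hi = min(pds), max(pds)
    weights = [0.0] * len(pds)
    weights[pds.index(lo)] = (hi - mean_pd) / (hi - lo)
    weights[pds.index(hi)] = (mean_pd - lo) / (hi - lo)
    return weights


# Ten equally weighted buckets with PDs 0.02%, 0.04%, ..., 10.24% (RC1 to RC10).
base_pds = [0.0002 * 2 ** k for k in range(10)]
equal = [0.1] * 10

ar_base = limiting_ar_buckets(base_pds, equal)
# Same buckets with every PD multiplied by 5 (0.10% ... 51.20%).
ar_scaled = limiting_ar_buckets([5 * p for p in base_pds], equal)

# Same average PD, but all weight moved to the two extreme buckets.
mean_pd = sum(w * p for w, p in zip(equal, base_pds))
ar_extreme = limiting_ar_buckets(base_pds, extreme_weights(base_pds, mean_pd))

print(f"AR, base buckets:    {ar_base:.2%}")
print(f"AR, PDs scaled by 5: {ar_scaled:.2%}")
print(f"AR, extreme loading: {ar_extreme:.2%}")
```

The first two values agree with the 71.66% and 78.19% quoted in the examples, and the third is higher than the first even though the average PD is unchanged, which illustrates how AR responds to the composition of the population and not only to the quality of the model.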
Conditional Default PD Distribution Tests
• PD distribution: $F(x)$ with density function $f(x)$
• PD distribution Given Default: $F_D(x)$ with density function $f_D(x)$
• PD distribution Given No Default: $F_N(x)$ with density function $f_N(x)$
• TPR (true positive rate): $TPR(x) = 1 - F_D(x)$
• FPR (false positive rate): $FPR(x) = 1 - F_N(x)$
• And AUC can be expressed as:
    $$AUC = \int_0^1 \left( \int_x^1 f_D(s)\, ds \right) f_N(x)\, dx$$

Conditional Default PD Distribution Tests
Assuming the PDs are the true default rates, by Bayes’ Theorem it can be shown that
    $$F_D(x) = \frac{x F(x) - \int_0^x F(s)\, ds}{\overline{PD}}, \qquad F_N(x) = \frac{(1 - x) F(x) + \int_0^x F(s)\, ds}{1 - \overline{PD}}$$
Note that AUC can then be calculated from $F(x)$ alone by substituting these expressions into the integral for AUC.

Conditional Default PD Distribution Tests
• Given a PD model, you can either compute $F(x)$ or estimate the distribution from the sample PDs.
• From $F(x)$ you can compute $F_D(x)$
• From the observed defaulters, you can compute the observed distribution of PDs conditional on default: $\hat{F}_D(x)$
• Apply the Kolmogorov-Smirnov test, with n being the number of total observations and m being the number of defaulters, to:
    $$D_{n,m} = \sup_x \left| F_D(x) - \hat{F}_D(x) \right|$$

Thank You
William Morokoff
Head of Quantitative Analytics
T: 212.438.4828
william.morokoff@standardandpoors.com