New Performance Measures
for Credit Risk Models
MPI 2014
William Morokoff
Yuchang Huang
Liming Yang
Quantitative Analytics
June 23, 2014
Permission to reprint or distribute any content from this presentation
requires the prior written approval of Standard & Poor’s. Copyright © 2014
by Standard & Poor’s Financial Services LLC. All rights reserved.
Agenda
Overview of Credit Risk
Corporate Probability of Default Models
Performance Measures for PD Models
Evaluating Performance Measures
Ideas for a New Performance Measure
Overview of Credit Risk
Credit Risk: Risk that a borrower will not make
timely payment of interest or principal.
Borrowers have a ‘default option’ – credit risk is
the risk that the option will be exercised.
Credit Instruments:
• Bonds – corporate, sovereign, municipal, etc.
• Corporate loans and CLOs
• Consumer asset securitizations – credit card, auto
loan, student loan, etc.
• Real Estate securitizations – commercial, residential
• Derivatives (embedded counterparty risk)
• Insurance-linked bonds (e.g. catastrophe bonds)
Overview of Credit Risk
Risk Management
• Relative default risk
categories (ratings)
• Risk tends to grow
exponentially by category
• Probability of Default
• Real world probability of
exercising default option
• PD(T | t0 , x0 )
• Loss Given Default
• Exposure at Default
• Portfolio Loss
Pricing
• Credit spreads
• Premium beyond risk free rate
• Option Adjusted
Spread
• Probability of Default
• Risk neutral measure
• Loss Given Default
• Implied Correlation
• For derivatives on portfolios
Corporate PD Models
• Structural (Merton/KMV) Models
• For publicly traded companies, equity is viewed as a call option on the
asset value A of a firm with strike = face value of debt (default point D)
• Distance to Default: $DD = \frac{\log(A/D)}{\sigma_A \sqrt{T}}$, with $PD = f(DD)$
• Reduced Form/Default Intensity Models
$PD(t) = 1 - \exp\left(-\int_0^t \lambda(s)\,ds\right)$
• Regression Models (logit/probit/etc.)
$PD = \frac{1}{1 + \exp\left(-(\beta_0 + \beta^T x)\right)}$
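A minimal Python sketch of the three model families listed on this slide. The function names are hypothetical, and two simplifications are assumptions rather than part of the slide: the mapping f from distance to default to PD is taken to be the standard normal tail, PD = Phi(-DD), and the default intensity is held constant over the horizon.

```python
import math
from statistics import NormalDist


def merton_pd(asset_value, default_point, asset_vol, horizon):
    """Return (DD, PD) with DD = log(A/D) / (sigma_A * sqrt(T)).

    Assumption: f(DD) = Phi(-DD); the slide leaves f unspecified.
    """
    dd = math.log(asset_value / default_point) / (asset_vol * math.sqrt(horizon))
    return dd, NormalDist().cdf(-dd)


def intensity_pd(hazard_rate, horizon):
    """PD(t) = 1 - exp(-lambda * t) for a constant intensity (assumed)."""
    return 1.0 - math.exp(-hazard_rate * horizon)


def logit_pd(beta0, beta, x):
    """PD = 1 / (1 + exp(-(beta0 + beta . x)))."""
    z = beta0 + sum(b * xi for b, xi in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-z))
```

For instance, `merton_pd(120.0, 100.0, 0.25, 1.0)` returns the distance to default and PD of a firm whose assets sit 20% above its default point under these assumptions.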

PD Modeling Process
Collect Firm and Market Data:
• Company specific financial ratios, debt levels, liquidity measures, …
• Macro-economic and market data
• Equity price, volatility, rank, etc.
• Need many years and many firms
Collect Default Data:
• Tag each firm observation as survivor or defaulter over period T
Construct and Scale Factors
Calibrate Model:
• Factor selection (e.g. Greedy Forward)
• Parameter calibration (MLE)
• Out of sample evaluation
Measure Performance
• How well does model differentiate defaulters and survivors
• How well does model fit the observed data
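The calibration step can be sketched for the simplest case, a logit model with a single factor fitted by maximum likelihood. This is an illustrative Newton-Raphson loop with a hypothetical function name and made-up toy data, not the calibration procedure used by the authors; it assumes the data are not perfectly separable, so that the MLE exists.

```python
import math


def fit_logit(xs, ys, iterations=25):
    """Newton-Raphson MLE for PD = 1 / (1 + exp(-(b0 + b1 * x))) with one factor."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p          # score with respect to b0
            g1 += (y - p) * x    # score with respect to b1
            h00 += w             # entries of the information matrix X'WX
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta += (X'WX)^{-1} * score, using the explicit 2x2 inverse.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Toy data (illustrative only): one scaled factor, 1 = defaulter, 0 = survivor.
xs = [0, 0, 1, 1, 2, 2, 3, 3]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
b0, b1 = fit_logit(xs, ys)
```

At the fitted parameters the sum of model PDs equals the number of observed defaulters, a standard property of logit MLE with an intercept.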
What Makes PD Modeling and Performance
Measurement Difficult?
• Data – Defaults are rare events, so default observations are scarce.
• Finance isn’t Physics – No universal law holds through time, so
relationships (i.e. weights on factors) may change through the
calibration period and may not hold going forward.
• There may not be a true PD (philosophical question)
o There may not be any (knowable) probability measure associated
with the future events that would lead to default. Such events may
not be knowable at this time.
o As a consequence, it is difficult to think about the accuracy of a PD
model in the usual sense of | PD_{True} - PD_{Model} |.
• Correlation
o Factors driving defaults are correlated
o Measures of model performance generally implicitly assume
independent observations
Performance Measures for PD
Models
Accuracy Ratio (AR)
• Sort firms/assets/obligors from riskiest to safest as predicted by the
credit model (x-axis) and plot against fraction of all defaulted obligors.
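• Accuracy Ratio = B / (B + A), where B is the area between the model's curve and the diagonal of a random model, and A is the area between the perfect model's curve and the model's curve.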
Note: The terms Gini Coefficient and Accuracy Ratio are often used interchangeably in credit modeling literature
Receiver Operating Characteristic (ROC)
• Plot the distribution of model score S for defaulters and non-defaulters
• Note: For a perfect scoring model, there will be no overlap in the distributions whereas
for a random/uninformative model, there will be 100% overlap
• If the score is a PD, it is generally not possible to perfectly separate defaulters from non-defaulters.
• Suppose a cutoff value C (i.e., ranks/scores less than C are potential
defaulters and rank scores higher than C are potential survivors)
• Given C, 4 outcomes are possible:
• Incorrect decisions: S < C and survive (Type II) or S > C and default (Type I)
• Correct decisions: S < C and default or S > C and survive
ROC (cont.)
• Define True Positive Rate as a function of C:
$TPR(C) = \frac{\#\,\text{Defaults with } PD > C}{\text{Total } \#\,\text{Defaults}}$
• Define False Positive Rate as a function of C:
$FPR(C) = \frac{\#\,\text{Non-Defaults with } PD > C}{\text{Total } \#\,\text{Non-Defaults}}$
• Plot TPR(C) vs FPR(C) for C = 1 → 0
The larger the AUC (area under the ROC curve, the shaded region), the better the ranking of the model, because the TPR is larger than the FPR.
$AUC = \int_0^1 TPR(C)\,d\,FPR(C)$
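The construction can be sketched in a few lines of Python. The function names are hypothetical, and the cutoff is swept over the observed PD values only, using `>=` at each observed value, which is equivalent to placing C just below that value. The integral is approximated with the trapezoid rule.

```python
def roc_points(pds, defaults):
    """Return (FPR, TPR) pairs as the cutoff C sweeps from 1 down to 0."""
    n_def = sum(defaults)
    n_non = len(defaults) - n_def
    points = [(0.0, 0.0)]
    # Lowering C admits observations in order of decreasing PD; ties enter together.
    for c in sorted(set(pds), reverse=True):
        tp = sum(1 for p, d in zip(pds, defaults) if d == 1 and p >= c)
        fp = sum(1 for p, d in zip(pds, defaults) if d == 0 and p >= c)
        points.append((fp / n_non, tp / n_def))
    return points


def auc(points):
    """Trapezoidal approximation of the integral of TPR d(FPR)."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy sample (illustrative only): two defaulters and two survivors.
area = auc(roc_points([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]))
```

In this toy sample three of the four defaulter/survivor pairs are ranked correctly, so the area is 0.75.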
Relationship between ROC and Accuracy Ratio
• If the same weight is attributed to Type I vs. Type II
errors, it can be shown that AUC and AR communicate
the same information
AR = 2·AUC − 1
• ROC is however a more general measure as different
weights may be given to Type I and II errors.
Typically, more weight may be given to Type I vs. Type II error:
Example 1: Consuming a toxic mushroom (I) vs. throwing away an edible one (II)
Example 2: Giving a loan to a defaulting firm (I) vs. losing potential interest
income by not extending credit to a non-defaulting firm (II)
• In practice, AUC and AR are usually treated as
equivalent.
Likelihood and Goodness of Fit
• Log-Likelihood:
$L = \sum_{j=1}^{N_D} \log PD_{i(j)} + \sum_{k=1}^{N_S} \log\left(1 - PD_{i(k)}\right)$
• Likelihood Ratio Test: Determine if model A fits the data better than model B.
$R = -2\,(L_B - L_A)$, $R \sim \chi^2$
• Deviance Test: Does the model fit the data well?
$D = -2L$, $D \sim \chi^2$
• Chi-Squared Test: Does the model fit the data well?
$\chi^2 = \sum_{j=1}^{N_D} \frac{1 - PD_{i(j)}}{PD_{i(j)}} + \sum_{k=1}^{N_S} \frac{PD_{i(k)}}{1 - PD_{i(k)}}$
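A short sketch of how the log-likelihood, deviance, and chi-squared statistics on this slide might be computed from a list of model PDs and default flags. The function names are hypothetical, every PD is assumed to lie strictly between 0 and 1, and no p-values are computed.

```python
import math


def log_likelihood(pds, defaults):
    """L = sum over defaulters of log(PD) + sum over survivors of log(1 - PD)."""
    return sum(math.log(p) if d else math.log(1.0 - p)
               for p, d in zip(pds, defaults))


def deviance(pds, defaults):
    """D = -2 L."""
    return -2.0 * log_likelihood(pds, defaults)


def chi_squared(pds, defaults):
    """(1 - PD) / PD for each defaulter plus PD / (1 - PD) for each survivor."""
    return sum((1.0 - p) / p if d else p / (1.0 - p)
               for p, d in zip(pds, defaults))
```

Each term of the chi-squared sum is the squared residual (X − PD)² divided by the Bernoulli variance PD(1 − PD), which is why a defaulter with a very low model PD contributes a very large term.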

Evaluating Performance
Measures
Accuracy Ratio Calculation
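• Assume sample of N observations ordered such that $PD_1 \ge PD_2 \ge \dots \ge PD_N$
$X_i = 1$ if defaulter (0 otherwise), $M = \sum_{i=1}^{N} X_i$
$y_i = \frac{1}{M} \sum_{j=1}^{i} X_j$
$AR_N = \frac{\frac{1}{N}\left(\sum_{i=1}^{N} y_i - \frac{1}{2}\right) - \frac{1}{2}}{\left(1 - \frac{M}{2N}\right) - \frac{1}{2}}$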
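The finite-sample calculation can be sketched as follows, with observations sorted from riskiest to safest, y_i the cumulative fraction of defaulters captured, and the areas taken from the trapezoid rule on the curve through the points (i/N, y_i). The function name is hypothetical, and ties in PD are broken by input order, which is an arbitrary choice.

```python
def accuracy_ratio(pds, defaults):
    """AR_N from the sorted sample: area above the diagonal relative to a perfect ranking."""
    order = sorted(range(len(pds)), key=lambda i: pds[i], reverse=True)
    n = len(pds)
    m = sum(defaults)
    if m == 0 or m == n:
        raise ValueError("AR is undefined without both defaulters and survivors")
    cumulative = 0
    sum_y = 0.0
    for i in order:
        cumulative += defaults[i]
        sum_y += cumulative / m                 # y_i: fraction of defaulters captured so far
    numerator = (sum_y - 0.5) / n - 0.5         # area between the curve and the diagonal
    denominator = (1.0 - m / (2.0 * n)) - 0.5   # same area for a perfect ranking
    return numerator / denominator
```

For example, `accuracy_ratio([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0])` gives 0.5, while a sample in which the two highest PDs are the two defaulters gives 1.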
Accuracy Ratio Calculation
With some arithmetic:
1
M
2
N
N

ARN 
M 
 i 

  Xi 
2
2N 
i 1  N 
M M
1  
N
N
N
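One way to carry out the arithmetic, shown as a worked step rather than as the authors' own derivation: exchange the order of summation in the sum of the y_i, then substitute into the area between the curve and the diagonal.

```latex
\sum_{i=1}^{N} y_i
  = \frac{1}{M}\sum_{i=1}^{N}\sum_{j=1}^{i} X_j
  = \frac{1}{M}\sum_{j=1}^{N} (N - j + 1)\, X_j
  = (N + 1) - \frac{1}{M}\sum_{j=1}^{N} j\, X_j ,
\qquad\text{so}\qquad
\frac{1}{N}\Bigl(\sum_{i=1}^{N} y_i - \tfrac{1}{2}\Bigr) - \tfrac{1}{2}
  = \tfrac{1}{2} + \frac{1}{2N} - \frac{1}{NM}\sum_{j=1}^{N} j\, X_j .
```

Dividing by the perfect-model area, (1/2)(1 − M/N), and multiplying numerator and denominator by 2M/N leaves M/N (1 + 1/N) − (2/N) Σ (i/N) X_i over (M/N)(1 − M/N).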
Now take the limit as N goes to infinity!
Limiting AR
• Assume that there exists a distribution function F(PD)
• Data set is considered as a sample of PDs from F(PD) iid.
• PDs can be considered random variables.
• For this calculation, we assume that each sampled PD is a true PD, i.e. the probability that the issuer defaults is exactly PD, so $X_i$ is a random variable and $E[X_i] = PD_i$
• In the limit as N goes to infinity,
$\frac{M}{N} \to \overline{PD} = E[PD] = \int_0^1 PD\,dF(PD) = 1 - \int_0^1 F(y)\,dy$
Limiting AR


$AR = \frac{\overline{PD} - 2\int_0^1 \left(\int_0^{F^{-1}(z)} y\,dF(y)\right) dz}{\overline{PD}\,(1 - \overline{PD})}$
With some calculus:
$AR = \frac{1 - \int_0^1 F(y)^2\,dy - \overline{PD}}{\overline{PD}\,(1 - \overline{PD})}$
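The calculus step can be filled in as follows. This is a sketch that assumes F is continuous on [0, 1], so that the substitution z = F(x) and the integration by parts are valid; it is not taken from the slides.

```latex
\int_0^1\!\!\int_0^{F^{-1}(z)} y\,dF(y)\,dz
  = \int_0^1 y\,\bigl(1 - F(y)\bigr)\,dF(y)
  = \tfrac{1}{2}\int_0^1 \bigl(1 - F(y)\bigr)^2\,dy ,
\qquad
\overline{PD} - \int_0^1 \bigl(1 - F(y)\bigr)^2\,dy
  = \overline{PD} - 1 + 2\int_0^1 F(y)\,dy - \int_0^1 F(y)^2\,dy
  = 1 - \overline{PD} - \int_0^1 F(y)^2\,dy .
```

The last equality uses the identity that the integral of F over [0, 1] equals one minus the mean PD.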

Observations on Limiting AR
• If {PD_i} and the associated {X_i} are considered random variables, then $AR_N$ is a random variable and
$E[AR_N] = AR + O(1/N)$
$\sigma^2(AR_N) = O(1/N)$
• Full distribution of $AR_N$ can be computed with simulation.
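A small simulation sketch of the sampling distribution. The population is an assumption chosen for convenience: PDs drawn iid from Uniform(0, 1), for which F(y) = y, the mean PD is 1/2, and the limiting AR works out to 2/3. Each sampled PD is treated as the true PD, as on the preceding slides, and the function name is hypothetical.

```python
import random


def simulate_ar(n_obs, n_trials, seed=0):
    """Sample AR_N when PDs are iid Uniform(0, 1) and each PD is the true PD."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_trials):
        pds = sorted((rng.random() for _ in range(n_obs)), reverse=True)
        defaults = [1 if rng.random() < p else 0 for p in pds]
        m = sum(defaults)
        if m == 0 or m == n_obs:
            continue  # AR_N is undefined for this draw
        # Closed form: [M/N (1 + 1/N) - (2/N) sum (i/N) X_i] / [M/N (1 - M/N)]
        rank_sum = sum(i * x for i, x in enumerate(defaults, start=1))
        frac = m / n_obs
        ar = (frac * (1.0 + 1.0 / n_obs) - 2.0 * rank_sum / n_obs ** 2) / (frac * (1.0 - frac))
        results.append(ar)
    return results


values = simulate_ar(500, 200)
mean_ar = sum(values) / len(values)
```

Repeating the run with a larger `n_obs` should narrow the spread of `values` around 2/3, in line with the O(1/N) variance.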
Observations on Limiting AR
Conclusion: Even for a perfect PD model, as a
performance measure for the model, Accuracy
Ratio is a noisy (sample-size dependent) estimate
of a quantity that depends only on the nature of the
population, not the quality of the model.
Special Cases
• Point mass at $0 < PD^* < 1$: $AR = 0$
  • All sample observations have identical PDs and therefore there is no ability to separate defaulters from non-defaulters.
• Point masses at $PD_0 = 0$ and $PD_1 = 1$: $AR = 1$
  • Part of population is guaranteed to survive and part is guaranteed to default, so perfect separation is possible.
Special Case: K Buckets
There are K distinct buckets each with a PD and a weight w representing a percentage of the total population such that
$PD_1 < PD_2 < \dots < PD_K$ and $\sum_{i=1}^{K} w_i = 1$
$F(x) = \sum_{i=1}^{K} w_i\,\mathbf{1}_{[PD_i, 1]}(x)$
$AR = \frac{\sum_{i=1}^{K}\sum_{j=1}^{K} w_i w_j \max(PD_i, PD_j) - \overline{PD}}{\overline{PD}\,(1 - \overline{PD})}$, where $\overline{PD} = \sum_{i=1}^{K} w_i PD_i$
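The bucket formula is straightforward to evaluate directly. The sketch below uses a hypothetical function name and applies it to ten equally weighted buckets whose PDs double from 0.02% to 10.24%; the double sum is the expected maximum of two independent draws from the bucket distribution.

```python
def bucket_ar(pds, weights):
    """Limiting AR for K buckets: (E[max(PD_i, PD_j)] - mean PD) / (mean PD * (1 - mean PD))."""
    mean_pd = sum(w * p for w, p in zip(weights, pds))
    expected_max = sum(wi * wj * max(pi, pj)
                       for wi, pi in zip(weights, pds)
                       for wj, pj in zip(weights, pds))
    return (expected_max - mean_pd) / (mean_pd * (1.0 - mean_pd))


# Ten equally weighted buckets with PDs doubling from 0.02% to 10.24%.
pds = [0.0002 * 2 ** k for k in range(10)]
weights = [0.1] * 10
ar = bucket_ar(pds, weights)                                  # about 0.7166
ar_scaled = bucket_ar([5.0 * p for p in pds], weights)        # about 0.7819
```

Multiplying every PD by five leaves the ranking and the share of defaults captured by each bucket unchanged, yet the AR rises, because the denominator depends on the mean PD.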

Example: 10 Buckets
Bucket   PD (%)    % Defaulters (cumulative)   % Obligors (cumulative)
RC10     10.24%     50.05%                      10.00%
RC9       5.12%     75.07%                      20.00%
RC8       2.56%     87.59%                      30.00%
RC7       1.28%     93.84%                      40.00%
RC6       0.64%     96.97%                      50.00%
RC5       0.32%     98.53%                      60.00%
RC4       0.16%     99.32%                      70.00%
RC3       0.08%     99.71%                      80.00%
RC2       0.04%     99.90%                      90.00%
RC1       0.02%    100.00%                     100.00%
[Figure: CAP curve, % Defaulters (y-axis) vs. % Obligors (x-axis)]
Even with perfect risk categorization on a PD basis, accuracy ratio is only 71.66%
Example: Increasing Default Levels Increases AR
Bucket   PD (%)    % Defaulters (cumulative)   % Obligors (cumulative)
RC10     51.20%     50.05%                      10.00%
RC9      25.60%     75.07%                      20.00%
RC8      12.80%     87.59%                      30.00%
RC7       6.40%     93.84%                      40.00%
RC6       3.20%     96.97%                      50.00%
RC5       1.60%     98.53%                      60.00%
RC4       0.80%     99.32%                      70.00%
RC3       0.40%     99.71%                      80.00%
RC2       0.20%     99.90%                      90.00%
RC1       0.10%    100.00%                     100.00%
[Figure: CAP curve, % Defaulters (y-axis) vs. % Obligors (x-axis)]
Increasing default levels (while maintaining percentage of defaults captured) increases AR from 71.66% to 78.19%
Changing AR by Changing Bucket Weights
AR is maximized when only the extreme buckets (most
risky and least risky) are loaded (for fixed average PD)
• Maximize: $w_1 = \frac{PD_K - \overline{PD}}{PD_K - PD_1}$, $w_K = \frac{\overline{PD} - PD_1}{PD_K - PD_1}$, $w_2 = \dots = w_{K-1} = 0$
• Minimize: $w_i = \frac{PD_{i+1} - \overline{PD}}{PD_{i+1} - PD_i}$, $w_{i+1} = \frac{\overline{PD} - PD_i}{PD_{i+1} - PD_i}$, $w_j = 0$ otherwise, where $PD_i \le \overline{PD} \le PD_{i+1}$
Implication: “Better” ARs may be obtained through
increased “sampling” from the two extreme buckets
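The effect of reweighting can be checked numerically. The sketch below (hypothetical function names) holds the mean PD of ten doubling buckets fixed and compares equal weights with weights concentrated on the two extreme buckets and on the two buckets adjacent to the mean PD.

```python
def bucket_ar(pds, weights):
    """Limiting AR for K buckets: (E[max(PD_i, PD_j)] - mean PD) / (mean PD * (1 - mean PD))."""
    mean_pd = sum(w * p for w, p in zip(weights, pds))
    expected_max = sum(wi * wj * max(pi, pj)
                       for wi, pi in zip(weights, pds)
                       for wj, pj in zip(weights, pds))
    return (expected_max - mean_pd) / (mean_pd * (1.0 - mean_pd))


def two_bucket_weights(pds, mean_pd, lo, hi):
    """Weights on buckets lo and hi (0-based) that reproduce mean_pd; zero elsewhere."""
    weights = [0.0] * len(pds)
    weights[lo] = (pds[hi] - mean_pd) / (pds[hi] - pds[lo])
    weights[hi] = (mean_pd - pds[lo]) / (pds[hi] - pds[lo])
    return weights


pds = [0.0002 * 2 ** k for k in range(10)]        # ascending PDs, 0.02% to 10.24%
equal = [0.1] * 10
mean_pd = sum(w * p for w, p in zip(equal, pds))

ar_equal = bucket_ar(pds, equal)
w_max = two_bucket_weights(pds, mean_pd, 0, len(pds) - 1)   # extreme buckets only
i = max(k for k, p in enumerate(pds) if p <= mean_pd)       # bucket just below the mean PD
w_min = two_bucket_weights(pds, mean_pd, i, i + 1)          # buckets adjacent to the mean
ar_max = bucket_ar(pds, w_max)
ar_min = bucket_ar(pds, w_min)
```

All three weightings share the same mean PD, so the differences in AR come purely from how the population is spread across buckets.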
Impact of Correlation in Large Pool Limit
Simple single factor Gaussian Copula correlation model
Main result: Increasing correlation improves AR for large
pools, except for maximum correlation of 100%
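The slides do not spell out the copula model, so the following is a sketch of the standard single-factor Gaussian copula under an assumed parameterization: each obligor defaults when sqrt(rho)·Z + sqrt(1 − rho)·eps falls below the threshold implied by its PD, with Z the common factor and eps idiosyncratic noise, both standard normal. The function name is hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def conditional_pd(pd, rho, z):
    """PD conditional on the common factor z; requires 0 < pd < 1 and 0 <= rho < 1."""
    threshold = _STD_NORMAL.inv_cdf(pd)
    return _STD_NORMAL.cdf((threshold - math.sqrt(rho) * z) / math.sqrt(1.0 - rho))
```

Conditional on z, defaults are independent, and averaging `conditional_pd` over the distribution of z recovers the unconditional PD. A low value of z raises every obligor's conditional PD at once, which is how the model produces correlated defaults.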
Impact of Correlation on Finite Pools
Simulation method: Estimate mean AR over 100,000 trials
• For finite pool there is a non-linear relationship between AR
and default rates, ED(AR) ≠ AR(ED)
• Main observation: Correlation improves mean AR for
heterogeneous (in PD) portfolios but is detrimental to
homogeneous portfolios
Ideas for a New Performance
Measure
Desirable Properties
• A better performance measure for a PD model
should focus on the correctness of the PD
estimate relative to true PD, and not be skewed
by the nature of the sample population.
• The evaluation of a true model (firms default
with exactly modeled PD frequency) would
receive a close-to-perfect score.
• Worst score would require significant
mislabeling of (almost) guaranteed defaulters
and survivors.
• Ideally, correlation of defaulters in sample would
be taken into account.
A Few Ideas:
• Measure difference between true PD and model
$s = 1 - E_x\,|PD_{True}(x) - PD_{Model}(x)|$
$s = 1 - \frac{1}{N}\sum_{i=1}^{N} |X_i - PD_{Model}(x_i)|$
• Measure difference in PD distributions
$s = 1 - \int_0^1 |F_{True}(PD) - F_{Model}(PD)|\,dPD$
• Measure likelihood of ‘Portfolio of Defaults’
  • Test whether number of defaults is consistent with a correlated portfolio model based on model PDs.
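The sample version of the first idea, which replaces the unobservable true PD with the default indicator X_i, can be sketched in a few lines (hypothetical function name).

```python
def absolute_error_score(model_pds, defaults):
    """s = 1 - (1/N) * sum |X_i - PD_Model(x_i)|, with outcomes X_i standing in for true PDs."""
    n = len(model_pds)
    return 1.0 - sum(abs(x - p) for x, p in zip(defaults, model_pds)) / n
```

A model that assigns PD 1 to every defaulter and PD 0 to every survivor scores 1, while the reverse labeling scores 0.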
Conditional Default PD Distribution Tests
PD distribution: $F(x)$, with density function $f(x)$
PD distribution Given Default: $F_D(x)$, with density function $f_D(x)$
PD distribution Given No Default: $F_N(x)$, with density function $f_N(x)$
TPR (true positive rate): $TPR(x) = 1 - F_D(x)$
FPR (false positive rate): $FPR(x) = 1 - F_N(x)$
And AUC can be expressed as:
$AUC = \int_0^1 \left(\int_x^1 f_D(s)\,ds\right) f_N(x)\,dx$
Conditional Default PD Distribution Tests
Assuming the PDs are the true default rate, by Bayes’ Theorem it can be
shown that
$F_D(x) = \frac{x F(x) - \int_0^x F(s)\,ds}{\overline{PD}}$
$F_N(x) = \frac{(1 - x) F(x) + \int_0^x F(s)\,ds}{1 - \overline{PD}}$
Note that AUC can be calculated as:
$AUC = \int_0^1 \left(1 - F_D(x)\right) dF_N(x)$
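For a discrete PD distribution the Bayes' Theorem step reduces to reweighting each bucket by its PD (for defaulters) or by one minus its PD (for survivors). The sketch below uses a hypothetical function name and ten equally weighted buckets with PDs doubling from 0.02% to 10.24%, and assumes the PDs are the true default rates.

```python
def conditional_cdfs(pds, weights, x):
    """Return (F_D(x), F_N(x)) for a discrete PD distribution."""
    mean_pd = sum(w * p for w, p in zip(weights, pds))
    f_d = sum(w * p for w, p in zip(weights, pds) if p <= x) / mean_pd
    f_n = sum(w * (1.0 - p) for w, p in zip(weights, pds) if p <= x) / (1.0 - mean_pd)
    return f_d, f_n


pds = [0.0002 * 2 ** k for k in range(10)]
weights = [0.1] * 10
mean_pd = sum(w * p for w, p in zip(weights, pds))
f_d, f_n = conditional_cdfs(pds, weights, 0.06)   # every bucket except the riskiest
```

Here `f_d` is about 0.4995: just under half of all defaulters come from the nine safer buckets, even though those buckets hold 90% of obligors.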
Conditional Default PD Distribution Tests
• Given a PD model, you can either compute $F(x)$ or estimate the distribution from the sample PDs.
• From $F(x)$ you can compute $F_D(x)$
• From the observed defaulters, you can compute the observed distribution of PDs conditional on default: $\hat{F}_D(x)$
• Apply the Kolmogorov-Smirnov test, with n being the number of total observations and m being the number of defaulters, to:
$D_{n,m} = \sup_x |F_D(x) - \hat{F}_D(x)|$
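A sketch of the test statistic itself, comparing a model-implied conditional distribution with the empirical distribution of the defaulters' PDs. The function name is hypothetical, `model_cdf` is assumed to be a continuous CDF supplied as a callable, and the p-value (which needs a reference distribution) is not computed.

```python
def ks_statistic(model_cdf, observed_pds):
    """sup over x of |F_D(x) - empirical F_D(x)| for the PDs of the observed defaulters."""
    xs = sorted(observed_pds)
    m = len(xs)
    d = 0.0
    for k, x in enumerate(xs):
        f = model_cdf(x)
        # The empirical CDF jumps from k/m to (k+1)/m at x; check both sides of the jump.
        d = max(d, abs(f - k / m), abs(f - (k + 1) / m))
    return d
```

For example, `ks_statistic(lambda x: x, [0.25, 0.75])` returns 0.25 when the model-implied distribution is uniform on [0, 1].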
Thank You
William Morokoff
Head of Quantitative Analytics
T: 212.438.4828
william.morokoff@standardandpoors.com