Efficient Quantification of Model Risk

How Wrong is your Model?
Efficient Quantification of Model Risk
Advanced Statistical Methods in Credit Risk
Royal Statistical Society
Alan Forrest, RBS Group
London, 13th June 2013
Information Classification – PUBLIC
Disclaimer
Thanks
With many thanks to the Credit Research Centre, University of Edinburgh Business School for
supporting me as a Visiting Scholar November and December 2012; and to the Royal Bank of Scotland
Group for granting me Special Leave to visit the CRC.
Disclaimer
The opinions expressed in this document are solely the author’s and do not necessarily reflect those of
The Royal Bank of Scotland Group or any of its subsidiaries.
Any graphs or tables shown are based on mock data and are for illustrative purposes only.
Overview
Model Risk Principles and Background

Model Risk – an emerging and powerful idea in bank regulation and credit risk
model management.

Model risk assessment needs quick and quantified sensitivity analysis.
Geometry and Information

Model specification and model sensitivity analysis can be presented
geometrically within a classical, mathematically rich theory.
Efficient Sensitivity Analysis

Insight to the modelling process.

Practical strategies for sensitivity analysis and managing model risk.
Model Risk Background
The US Regulator (Fed / OCC)

“The use of models invariably presents model risk, which is the potential for
adverse consequences from decisions based on incorrect or misused model
outputs and reports.”
The Statisticians (George Box, Norman Draper)

“Essentially, all models are wrong, but some are useful.”
The Logicians (attr John Maynard Keynes, but in fact Carveth Read)

“It is better to be roughly right than precisely wrong.”
Model Risk Background
FSA - Turner Review - March 2009

“Misplaced reliance on sophisticated maths”
– Too many pricing and lending decisions did not take into account the assumptions and
limitations of the models.
BoE - The Dog and the Frisbee – Haldane, August 2012

“… opacity and complexity… It is close to impossible to tell whether results from
[internal risk models] are prudent.”
– If we cannot say why we trust a model, are we right to use it?
Fed / OCC 2011-12a

“Model Risk should be managed like other types of risk.”
Model Risk Background
Sensitivity Analysis

How different would the model be if … ?
– The missing data were filled in in a different way
– A different time-period was used to develop the model
– We enter a downturn or stressed period
– An expert panel decision happened to turn out differently
– Etc.

This is the key to quantitative model risk, but it is hard work
– Then combinations of these sensitivities…
– Model fitting is a multi-stage process…
– Sensitivity impact (eg capital, provisions) is complex and multidimensional…

Is this work out of proportion to the benefits?

Can we do sensitivities quickly? Without refitting models?
Geometry and Information
Describing Data

A model is a description of the development data.
– Model developers choose one of these descriptions for use in a decision.
– Model Risk considers the degree to which this choice could influence the decision.

This talk considers frequency histogram or contingency table descriptions of
models and data (Kullback, Centsov).
– Discrete buckets with sample frequencies (proportions, not counts).
– This is not a practical limitation: data elements are often classified and always finite precision.
– We should not restrict too early the breadth of models that could be chosen.

This defines a high- but finite-dimensional space whose coordinate
values are the frequencies in each bucket.
– The observed data is a single point in this space.
– Models are also “data” points: points that are preferred for use or descriptive
convenience.
– Dimension is the number of buckets, with one constraint: sum to 1 .
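
As an illustration of the bucket description above, here is a minimal sketch that turns raw values into a frequency point. The function name `frequency_point`, the sample scores and the bucket boundaries are all hypothetical, chosen only to show that the result is a vector of proportions summing to 1.

```python
from bisect import bisect_right
from collections import Counter


def frequency_point(values, edges):
    """Map raw values to bucket proportions (a point on the simplex).

    `edges` are the interior bucket boundaries, so there are
    len(edges) + 1 buckets in total.
    """
    n_buckets = len(edges) + 1
    # bisect_right gives the index of the bucket each value falls into.
    counts = Counter(bisect_right(edges, v) for v in values)
    total = len(values)
    return [counts.get(b, 0) / total for b in range(n_buckets)]


# Hypothetical development data and bucket boundaries.
scores = [0.12, 0.40, 0.55, 0.58, 0.71, 0.93, 0.97, 0.30]
edges = [0.33, 0.66]

d = frequency_point(scores, edges)
print(d)       # proportions per bucket
print(sum(d))  # the single constraint: proportions sum to 1
```

With three buckets and one constraint, this point lives on a 2-dimensional simplex.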
Geometry and Information
Illustration of a 3-cell model space: 2 dimensional simplex
Geometry and Information
Data, model subspace and model choice
[Figure: the data point, the model subspace, and the model chosen “closest to the data”.]
Geometry and Information
Sensitivity, model space flatness, over-sensitivity and over-fitting.
[Figure: models fitted to original data and to modified data. Panel labels: “Just right”; “Over-sensitive / discontinuous”; “Over-fitting”; “Type 3 (or type 0) Error”: right answer to the wrong question.]
Geometry and Information
Fitting a model to data: a log-linear example $m_w = c\,e^{a w}$
[Figure labels: prior; MLE fit; data.]
Geometry and Information
Likelihood maximisation and information

Some notation
– d is the data point: the vector $(d_w)$ of frequencies, indexed by bucket labels w, with $\sum_w d_w = 1$.
– m is the model point $(m_w)$
– N is the sample size

Log Likelihood of d, given m, is $\log L = -N\,\bigl(I(d,m) + H(d)\bigr)$
where
$I(d,m) = \sum_w d_w \log(d_w / m_w)$ is the Kullback-Leibler divergence,
$H(d) = -\sum_w d_w \log(d_w)$ is the Shannon Entropy of the data.
– Maximising likelihood is the same as minimising KL divergence.
– but KL divergence is not a metric.
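
The identity between log likelihood, KL divergence and entropy can be checked numerically. The following is a minimal sketch with made-up frequency vectors `d` and `m`; the log likelihood here omits the multinomial combinatorial constant, which is an assumption consistent with the formula as stated.

```python
import math


def kl_divergence(d, m):
    """I(d, m) = sum_w d_w * log(d_w / m_w); empty cells contribute zero."""
    return sum(dw * math.log(dw / mw) for dw, mw in zip(d, m) if dw > 0)


def entropy(d):
    """H(d) = -sum_w d_w * log(d_w)."""
    return -sum(dw * math.log(dw) for dw in d if dw > 0)


def log_likelihood(d, m, n):
    """Log likelihood of frequencies d under model m for sample size n,
    without the combinatorial constant (assumed dropped)."""
    return n * sum(dw * math.log(mw) for dw, mw in zip(d, m) if dw > 0)


# Illustrative (mock) data and model points on a 3-cell simplex.
d = [0.5, 0.3, 0.2]
m = [0.4, 0.4, 0.2]
N = 1000

lhs = log_likelihood(d, m, N)
rhs = -N * (kl_divergence(d, m) + entropy(d))
print(lhs, rhs)  # the two agree up to floating-point error

# KL divergence is not symmetric, one reason it is not a metric.
print(kl_divergence(d, m), kl_divergence(m, d))
```

Since H(d) does not depend on m, maximising the likelihood over m is the same as minimising I(d,m).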

Following this line gives principles for practical inference, model selection etc.
– Akaike’s Information Criterion follows from this point of view. The Bayes Information
Criterion also follows, assuming different priors and sample asymptotics.
Geometry and Information
Dual optimisation principles

For exponential families of models:
– The blue spaces are linear.
– The model space is generated “tropically”.

Principle of Maximum Likelihood
– The model that maximises likelihood is
where the data’s blue space hits the red
model space.

Principle of Maximum Entropy (uniform
prior)
– Within each blue space, the red point
maximises entropy, H(x) .

Principles of Inference
– if m is the MLE fit to data d, and m’ is
another model from the model space,
then I(d,m’) = I(d,m) + I(m,m’) .
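
The inference identity can be illustrated on a small exponential family. This sketch assumes the independence model on a 2-by-2 table, chosen here only as a convenient example; its MLE fit is the product of the observed margins. All numbers are made up.

```python
import math


def kl(p, q):
    """Kullback-Leibler divergence I(p, q) over matching cells."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)


def independent(rows, cols):
    """Product distribution on a table, flattened row by row."""
    return [r * c for r in rows for c in cols]


# Mock data on a 2x2 table, flattened row by row.
d = [0.30, 0.10, 0.20, 0.40]
row_margin = [d[0] + d[1], d[2] + d[3]]
col_margin = [d[0] + d[2], d[1] + d[3]]

# MLE fit within the independence model: product of observed margins.
m = independent(row_margin, col_margin)

# Any other member of the same model space.
m_other = independent([0.5, 0.5], [0.7, 0.3])

lhs = kl(d, m_other)
rhs = kl(d, m) + kl(m, m_other)
print(lhs, rhs)  # equal up to floating-point error
```

The decomposition also shows why m is the closest model point to d: any other choice adds the non-negative term I(m,m’).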
Geometry and Information
Two ways of moving distributions – taxic and tropic – shifts and tilts

Tropic movements generated by groups characterise exponential families of models
Taxic movement f(x) → f(x-t)
– Uses a group action on the state space.
Tropic movement f(x) → g(x) f(x)
– A group of functions acting multiplicatively on probability distributions.
– Renormalising after each action.
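
To make the two kinds of movement concrete, here is a small sketch on a discrete distribution. It assumes a cyclic shift as the group action for the taxic case and exponential weights for the tropic case; the helper names `taxic_shift`, `tropic_tilt` and `tilt_weights` are hypothetical.

```python
import math


def taxic_shift(f, t):
    """Shift: f(x) -> f(x - t), using a cyclic action on the states."""
    n = len(f)
    return [f[(x - t) % n] for x in range(n)]


def tropic_tilt(f, g):
    """Tilt: f(x) -> g(x) f(x), renormalised to sum to 1."""
    products = [gx * fx for gx, fx in zip(g, f)]
    total = sum(products)
    return [p / total for p in products]


def tilt_weights(a, n):
    """Exponential weights g(x) = exp(a * x) on states 0..n-1."""
    return [math.exp(a * x) for x in range(n)]


f = [0.1, 0.2, 0.3, 0.4]

shifted = taxic_shift(f, 1)
tilted = tropic_tilt(f, tilt_weights(0.5, len(f)))

# Tilts compose like a group: tilting by 0.2 and then by 0.3
# gives the same distribution as tilting once by 0.5.
two_step = tropic_tilt(tropic_tilt(f, tilt_weights(0.2, len(f))),
                       tilt_weights(0.3, len(f)))
print(shifted)
print(tilted)
print(two_step)
```

The one-parameter family obtained by varying the tilt exponent is a simple exponential family generated from the starting distribution f.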
Geometry and Information
Dual Foliations (Amari, Critchley et al.)

Assume a tropic structure.
– Equivalently, decide what direction to give
the blue subspaces.

Each model space is generated
tropically from a prior $m_0$.
– Maximum likelihood principle still applies.

Modified principle of maximum entropy:
– Within each blue space, red points
minimise $I(x, m_0)$.

Principle of Inference still holds

Euclidean metric $ds^2 = \sum_w dx_w^2$ is not
the natural geometry.
Geometry and Information
Spherical Geometry

Another metric $ds^2 = \sum_w dx_w^2 / x_w$
appears more natural.
– Locally equivalent to KL divergence: $ds^2 = 2\,I(x, x+dx) = 2\,I(x+dx, x)$, up to third
order terms.
– “Local Chi-squared”.
– “Boot-strapping geometry”.
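
The local equivalence between this metric and KL divergence can be checked for a small shift. This is a minimal sketch with an arbitrary point `x` and a small shift `dx` whose components sum to zero; both are made up for illustration.

```python
import math


def kl(p, q):
    """Kullback-Leibler divergence I(p, q)."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)


def ds2(x, dx):
    """Squared length element: sum_w dx_w^2 / x_w."""
    return sum(step * step / xw for xw, step in zip(x, dx))


x = [0.5, 0.3, 0.2]
dx = [0.001, -0.0004, -0.0006]  # small shift, components sum to zero
y = [xw + step for xw, step in zip(x, dx)]

metric = ds2(x, dx)
forward = 2 * kl(x, y)
backward = 2 * kl(y, x)
print(metric, forward, backward)  # all three agree to leading order
```

The remaining differences are of third order in the shift, so they shrink faster than the quantities themselves as dx gets smaller.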

The model fitting foliations are
orthogonal in this metric.
– The model space curvature is true.

Isometric to a portion of a sphere.
– The substitution $2x_w = u_w^2$ connects this space
isometrically with Euclidean space.
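
A square-root change of coordinates shows the spherical picture numerically. This sketch assumes the scaling u_w = 2·sqrt(x_w), picked here so that the Euclidean length element in u matches the metric exactly; a different normalisation changes lengths only by a constant factor. The point and shift are made up.

```python
import math


def ds2(x, dx):
    """Squared length element: sum_w dx_w^2 / x_w."""
    return sum(step * step / xw for xw, step in zip(x, dx))


def to_sphere(x):
    """Assumed scaling u_w = 2 * sqrt(x_w): the simplex maps onto
    part of a sphere of radius 2."""
    return [2.0 * math.sqrt(xw) for xw in x]


x = [0.5, 0.3, 0.2]
dx = [0.001, -0.0004, -0.0006]
y = [xw + step for xw, step in zip(x, dx)]

u, v = to_sphere(x), to_sphere(y)

radius_squared = sum(uw * uw for uw in u)
euclidean = sum((vw - uw) ** 2 for uw, vw in zip(u, v))
print(radius_squared)       # 4 * sum(x): every point lies on the same sphere
print(euclidean, ds2(x, dx))  # agree to leading order for small shifts
```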
Efficient Sensitivity Analysis
Data shifts give rise to model shifts

Seek a distance constraint: $ds^2(\text{model}) \le \text{Const} \cdot ds^2(\text{data})$

Seek understanding of the map: d(data) → d(model)
– d(SQRT(data)) → d(SQRT(model))
– d(LOG(data)) → d(LOG(model))
– Linear, singular
– Expansive or contractive directions?
– Twists?
[Figure: data and shifted data, mapped to model and shifted model.]
Efficient Sensitivity Analysis

More generally, seek conditions for contractive sensitivities.
– These are the “flat” directions in the model space – good for model risk.
– Some model families contract for all data-shifts, eg a model that groups two classes.

In context of logistic models, situations where sensitivities contract include:
– Full interaction descriptions, eg population shifts (i.e. independent of default event);
– Class amalgamation in dummy factors.
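
The contraction for class amalgamation can be seen in a small numerical sketch. It assumes a model that groups two cells under a uniform prior, so that the MLE fit gives each grouped cell the average of the two observed frequencies; the data and the shift are made up.

```python
def ds2(x, dx):
    """Squared length element: sum_w dx_w^2 / x_w."""
    return sum(step * step / xw for xw, step in zip(x, dx))


def amalgamate(x, i, j):
    """MLE fit for a model that groups cells i and j (uniform prior
    assumed): both cells receive the average of the two frequencies."""
    m = list(x)
    average = (x[i] + x[j]) / 2.0
    m[i] = average
    m[j] = average
    return m


# Mock data point and a hypothetical data shift (components sum to zero).
d = [0.10, 0.30, 0.25, 0.35]
shift = [0.010, -0.004, -0.003, -0.003]
d_shifted = [dw + step for dw, step in zip(d, shift)]

m = amalgamate(d, 0, 1)
m_shifted = amalgamate(d_shifted, 0, 1)
model_shift = [after - before for before, after in zip(m, m_shifted)]

data_ds2 = ds2(d, shift)
model_ds2 = ds2(m, model_shift)
print(data_ds2, model_ds2)  # the model moves less than the data
```

For this grouping model the inequality holds for every shift, by the Cauchy-Schwarz inequality applied to the two grouped cells.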
Efficient Sensitivity Analysis

Some general results are possible:

For an exponential family of models generated by dummy factors from a
uniform prior: If m is the MLE model fitted to data d, then
$ds^2(\text{model}) \le e^{I(d,m)}\, ds^2(\text{data})$.

Outline of proof:
– A sequence of MLE models takes us from d to m, whose nested model spaces drop
one dimension at each step: $d = m_0, \ldots, m_t = m$.
– The conditions assumed allow us to arrange that each step in the model sequence
removes one dummy factor at a time.
– The simplified 2-by-K context allows the direct calculation
$ds^2(m_{i+1}) \le e^{I(m_i, m_{i+1})}\, ds^2(m_i)$.
– However, inference and the MLE conditions give us
$I(d,m) = I(d,m_1) + I(m_1,m_2) + \ldots + I(m_{t-1},m)$.
– QED

Conjecture this is true for general exponential families and general priors.
Efficient Sensitivity Analysis
A sensitivity principle implied by boot-strapping

For large development samples, the standard error ellipsoid is sufficient to
describe model sensitivity to data shifts.
[Figure labels:]
– Data Bootstrapping Ellipsoid = ds-ball
– Model Standard Error Ellipsoid = image of data ds-ball
– Model Prediction Error Ellipsoid = ds-ball
– Scale = Chi-sq (df = dimension) / (2N)
Implications for Model Risk
Managing Sensitivities

Sensitivity analysis asks “what if the data was different?” Now we can connect
model changes to data movements geometrically.

Sensitivities can be prioritised by the (most generous) metric distance of their
specified data changes, $ds^2(\text{data})$.
– Sometimes it is simpler to prioritise on the divergence, which is numerically a good
approximation to the distance.

Establish a limit to the size of data sensitivity, below which material changes of
the model clearly will not happen.

Convert model variation, measured by $ds^2(\text{model})$, into a simple business
measure, as a communicable measure of model specification risk.
– For example, conservatism equivalent, swap-set size, effect on Gini, provisions impact
or RWA impact.
Implications for Model Risk
Example

A model risk with default identification has been identified for a logistic scorecard model. A
sensitivity analysis has been proposed: How different could the model be if the default
counts were undercounted?

Assume a recorded 5% default rate, with possible additional defaults up to 5.5%. A
general upper bound for data $ds^2$ in this case is about 0.005.
– A sharper upper bound could be achieved knowing more about the factor distributions.
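
For orientation, the size of this shift can be computed in the coarsest possible view. The sketch below assumes only two buckets (default and non-default), which is a simplification introduced here; the general bound quoted above applies to the full bucketed data space, whose factor distributions are not given.

```python
def ds2(x, dx):
    """Squared length element: sum_w dx_w^2 / x_w."""
    return sum(step * step / xw for xw, step in zip(x, dx))


# Two-bucket view (an assumption): [default, non-default].
recorded = [0.05, 0.95]
shift = [0.005, -0.005]  # default rate moves from 5% to 5.5%

coarse_ds2 = ds2(recorded, shift)
print(coarse_ds2)  # roughly 0.0005, inside the general bound of 0.005
```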

This prioritises this sensitivity analysis in the model’s management.
– All things being equal, it will come after a check on missing values accuracy (est. $ds^2$ = 0.2) and
before a check on the robustness of classing another factor (est. $ds^2$ = 0.003).

With a sample size of 10,000, and data space of dimension 100, this gives a 95%-ile
bootstrap $ds^2$ radius of about 0.01 (1/20000 * chi-sq(df=100)). Therefore the sensitivity
proposed is about 1/2 the strength of a standard error 95% significance movement.
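
The bootstrap radius can be reproduced with the standard library alone. This sketch uses the Wilson-Hilferty approximation to the chi-squared quantile, a technique swapped in here to avoid external dependencies; the function names are hypothetical. It gives a value of roughly 0.006, the same order of magnitude as the rounded figure quoted above.

```python
from statistics import NormalDist


def chi2_quantile_wh(p, df):
    """Approximate chi-squared quantile (Wilson-Hilferty approximation)."""
    z = NormalDist().inv_cdf(p)
    a = 2.0 / (9.0 * df)
    return df * (1.0 - a + z * a ** 0.5) ** 3


def bootstrap_ds2_radius(sample_size, dimension, level=0.95):
    """Scale = Chi-sq(df = dimension) / (2N), at the given level."""
    return chi2_quantile_wh(level, dimension) / (2.0 * sample_size)


radius = bootstrap_ds2_radius(sample_size=10_000, dimension=100)
print(radius)

# Strength of the proposed sensitivity relative to this radius,
# using the general upper bound of 0.005 from the example.
print(0.005 / radius)
```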

Conclusion: assuming this model is built on robust use of standard errors, this sensitivity
should cause no change to the significance of the factors selected, but could change the
factor selection and the size of some of the parameters.
Conclusions
Model Risk Principles

Model risk is an important and growing area of banks’ risk management.

The key to quantitative model risk assessment is sensitivity analysis, and

The key to practical sensitivity analysis is a quick, effective method to gauge
model variation without having to rebuild models.
Efficient Sensitivity Analysis and Model Risk Management

Classical ideas of statistical geometry and information theory add insight to the
process of model fitting.

Sensitivity analysis is framed as a differential data-shift problem.

This approach to sensitivity analysis develops fully general principles, and
practical methods that cut resource and by-pass the need to refit models.