General Iteration Algorithms

by
Luyang Fu, Ph.D., State Auto Insurance Company
Cheng-sheng Peter Wu, FCAS, ASA, MAAA, Deloitte Consulting LLP
2007 CAS Annual Meeting
Chicago, Nov. 11-14, 2007

Agenda

- History and Overview of Minimum Bias Method
- General Iteration Algorithms (GIA)
- Demonstration of a GIA Tool
- GIA Statistics
- Conclusions
- Q&A

History on Minimum Bias

- A technique with a long history for actuaries:
  - Bailey and Simon (1960)
  - Bailey (1963)
  - Brown (1988)
  - Feldblum and Brosius (2002)
  - A topic on CAS Exam 9
- Concepts:
  - Derive multivariate class plan parameters by minimizing a specified "bias" function
  - Use an "iterative" method to find the parameters

History on Minimum Bias

- Various bias functions have been proposed in the past for minimization
- Examples of multiplicative bias functions proposed in the past:

$$\text{Balanced Bias} = \sum_{i,j} w_{i,j}\,(r_{i,j} - x_i y_j)$$

$$\text{Squared Bias} = \sum_{i,j} w_{i,j}\,(r_{i,j} - x_i y_j)^2$$

$$\text{Chi-Squared Bias} = \sum_{i,j} \frac{w_{i,j}\,(r_{i,j} - x_i y_j)^2}{x_i y_j}$$

History on Minimum Bias

- Then, how to determine the class plan parameters by minimizing the bias function?
- One simple and commonly used way is an "iterative" root-finding methodology:
  - Start with a random guess for the values of x_i and y_j
  - Calculate the next set of values for x_i and y_j using the root-finding formula for the bias function
  - Repeat the steps until the values converge
- Easy to understand and can be programmed in almost any tool

History on Minimum Bias

- For example, using the balanced bias function for the multiplicative model:

$$\text{Balanced Bias} = \sum_{i,j} w_{i,j}\,(r_{i,j} - x_i y_j) = 0$$

- Then:

$$\hat{x}_{i,t} = \frac{\sum_j w_{i,j}\, r_{i,j}}{\sum_j w_{i,j}\, \hat{y}_{j,t-1}}, \qquad
\hat{y}_{j,t} = \frac{\sum_i w_{i,j}\, r_{i,j}}{\sum_i w_{i,j}\, \hat{x}_{i,t-1}}$$

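To make the iteration concrete, here is a minimal NumPy sketch of the balanced-bias update above. The function and variable names are illustrative, not the authors' tool; the y-update uses the freshly updated x, per the "latest relativities" convention described on the numerical methodology slide later in this deck.

```python
import numpy as np

def balanced_bias_fit(r, w, n_iter=100, tol=1e-10):
    """Minimum bias iteration for the balanced bias function.

    r : (m, n) array of observed relativities r[i, j]
    w : (m, n) array of exposure weights w[i, j]
    Returns multiplicative class factors x (length m) and y (length n).
    """
    m, n = r.shape
    x, y = np.ones(m), np.ones(n)   # starting guess for x_i and y_j
    for _ in range(n_iter):
        # x update: sum_j w*r / sum_j w*y, then the analogous y update
        x_new = (w * r).sum(axis=1) / (w * y).sum(axis=1)
        y_new = (w * r).sum(axis=0) / (w * x_new[:, None]).sum(axis=0)
        done = np.allclose(x_new, x, atol=tol) and np.allclose(y_new, y, atol=tol)
        x, y = x_new, y_new
        if done:
            break
    return x, y
```
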
History on Minimum Bias

- Past minimum bias models with the iterative method:

$$\hat{x}_{i,t} = \frac{\sum_j w_{i,j}\, r_{i,j}}{\sum_j w_{i,j}\, \hat{y}_{j,t-1}}$$

$$\hat{x}_{i,t} = \left( \frac{\sum_j w_{i,j}\, r_{i,j}^2\, \hat{y}_{j,t-1}^{-1}}{\sum_j w_{i,j}\, \hat{y}_{j,t-1}} \right)^{1/2}$$

$$\hat{x}_{i,t} = \frac{1}{n} \sum_j \frac{r_{i,j}}{\hat{y}_{j,t-1}}$$

$$\hat{x}_{i,t} = \frac{\sum_j w_{i,j}^2\, r_{i,j}\, \hat{y}_{j,t-1}}{\sum_j w_{i,j}^2\, \hat{y}_{j,t-1}^2}$$

$$\hat{x}_{i,t} = \frac{\sum_j w_{i,j}\, r_{i,j}\, \hat{y}_{j,t-1}}{\sum_j w_{i,j}\, \hat{y}_{j,t-1}^2}$$

Iteration Algorithm for Minimum Bias

- Theoretically, $\sum_{i,j} w_{i,j}\,(r_{i,j} - x_i y_j)$ is not "bias".
- Bias is defined as the difference between an estimator and the true value: $E(\hat{x}_i) - x_i$ is the bias of $\hat{x}_i$. If $E(\hat{x}_i) - x_i = 0$, then $\hat{x}_i$ is an unbiased estimator of $x_i$.
- To be consistent with statistical terminology, we name our approach the General Iteration Algorithm (GIA).

Issues with the Iterative Method

- Two questions regarding the "iterative" method:
  - How do we know that it will converge?
  - How fast/efficiently will it converge?
- Answers:
  - Numerical analysis and optimization textbooks
  - Mildenhall (1999)
  - Efficiency is a less important issue given modern computing power

Other Issues with Minimum Bias

- What is the statistical meaning behind these models?
- Are there more models to try?
- Which models should we choose?

Summary on Historical Minimum Bias

- A numerical method, not a statistical approach
- Gives the "best" answers when the bias functions are minimized
- Uses an "iterative" root-finding methodology to determine the parameters
- Easy to understand and can be programmed in many tools

Connection Between Minimum Bias and Statistical Models

- Brown (1988)
  - Showed that some minimum bias functions can be derived by maximizing the likelihood functions of corresponding distributions
  - Proposed several more minimum bias models
- Mildenhall (1999)
  - Proved that minimum bias models with linear bias functions are essentially the same as those from Generalized Linear Models (GLM)
  - Proposed two more minimum bias models

Connection Between Minimum Bias and Statistical Models

- Past minimum bias models and their corresponding statistical models:

$$\hat{x}_{i,t} = \frac{\sum_j w_{i,j}\, r_{i,j}}{\sum_j w_{i,j}\, \hat{y}_{j,t-1}} \quad\Rightarrow\quad \text{Poisson}$$

$$\hat{x}_{i,t} = \left( \frac{\sum_j w_{i,j}\, r_{i,j}^2\, \hat{y}_{j,t-1}^{-1}}{\sum_j w_{i,j}\, \hat{y}_{j,t-1}} \right)^{1/2} \quad\Rightarrow\quad \text{Chi-square (Bailey and Simon; no corresponding GLM)}$$

$$\hat{x}_{i,t} = \frac{1}{n} \sum_j \frac{r_{i,j}}{\hat{y}_{j,t-1}} \quad\Rightarrow\quad \text{Exponential}$$

$$\hat{x}_{i,t} = \frac{\sum_j w_{i,j}^2\, r_{i,j}\, \hat{y}_{j,t-1}}{\sum_j w_{i,j}^2\, \hat{y}_{j,t-1}^2} \quad\Rightarrow\quad \text{Normal}$$

$$\hat{x}_{i,t} = \frac{\sum_j w_{i,j}\, r_{i,j}\, \hat{y}_{j,t-1}}{\sum_j w_{i,j}\, \hat{y}_{j,t-1}^2} \quad\Rightarrow\quad \text{Least squares}$$

Statistical Models - GLM

- Advantages include:
  - Commercial software and built-in procedures are available
  - Statistical characteristics are well determined, such as confidence levels
  - Computationally efficient compared to the iterative procedure

Statistical Models - GLM

- Issues include:
  - Requires more advanced statistical knowledge
  - Lack of flexibility:
    - Reliance on commercial software / built-in procedures
    - Cannot fit the mixed model
    - Assumes a pre-determined distribution from the exponential family
    - Limited distribution selections in popular statistical software
    - Difficult to program from scratch

Motivations for GIA

- Can we unify all the past minimum bias models?
- Can we completely represent the wide range of GLM and statistical models using minimum bias models?
- Can we expand the model selection options beyond the currently used GLM and minimum bias models?
- Can we fit mixed models or constrained models?

General Iteration Algorithm

- Start with the basic multiplicative formula:

$$r_{i,j} = x_i y_j$$

- The alternative estimates of x and y:

$$\hat{x}_{i,j} = r_{i,j} / y_j, \quad j = 1, 2, \ldots, n$$

$$\hat{y}_{j,i} = r_{i,j} / x_i, \quad i = 1, 2, \ldots, m$$

- The next question is: how to roll up $\hat{x}_{i,j}$ to $\hat{x}_i$, and $\hat{y}_{j,i}$ to $\hat{y}_j$?

Possible Weighting Functions

- First and most obvious option: a straight average to roll up

$$\hat{x}_i = \frac{1}{n} \sum_j \hat{x}_{i,j} = \frac{1}{n} \sum_j \frac{r_{i,j}}{\hat{y}_j}$$

$$\hat{y}_j = \frac{1}{m} \sum_i \hat{y}_{j,i} = \frac{1}{m} \sum_i \frac{r_{i,j}}{\hat{x}_i}$$

- Using the straight average results in the exponential model by Brown (1988)

Possible Weighting Functions

- Another option is to use the relativity-adjusted exposure as the weight function:

$$\hat{x}_i = \sum_j \left( \frac{w_{i,j}\, \hat{y}_j}{\sum_j w_{i,j}\, \hat{y}_j} \right) \hat{x}_{i,j}
= \sum_j \left( \frac{w_{i,j}\, \hat{y}_j}{\sum_j w_{i,j}\, \hat{y}_j} \right) \frac{r_{i,j}}{\hat{y}_j}
= \frac{\sum_j w_{i,j}\, r_{i,j}}{\sum_j w_{i,j}\, \hat{y}_j}$$

$$\hat{y}_j = \sum_i \left( \frac{w_{i,j}\, \hat{x}_i}{\sum_i w_{i,j}\, \hat{x}_i} \right) \hat{y}_{j,i}
= \sum_i \left( \frac{w_{i,j}\, \hat{x}_i}{\sum_i w_{i,j}\, \hat{x}_i} \right) \frac{r_{i,j}}{\hat{x}_i}
= \frac{\sum_i w_{i,j}\, r_{i,j}}{\sum_i w_{i,j}\, \hat{x}_i}$$

- This is the Bailey (1963) model, or the Poisson model by Brown (1988)

Possible Weighting Functions

- Another option: use the square of the relativity-adjusted exposure:

$$\hat{x}_i = \sum_j \left( \frac{w_{i,j}^2\, \hat{y}_j^2}{\sum_j w_{i,j}^2\, \hat{y}_j^2} \right) \hat{x}_{i,j}
= \frac{\sum_j w_{i,j}^2\, r_{i,j}\, \hat{y}_j}{\sum_j w_{i,j}^2\, \hat{y}_j^2}$$

$$\hat{y}_j = \sum_i \left( \frac{w_{i,j}^2\, \hat{x}_i^2}{\sum_i w_{i,j}^2\, \hat{x}_i^2} \right) \hat{y}_{j,i}
= \frac{\sum_i w_{i,j}^2\, r_{i,j}\, \hat{x}_i}{\sum_i w_{i,j}^2\, \hat{x}_i^2}$$

- This is the normal model by Brown (1988)

Possible Weighting Functions

- Another option: use the relativity-square-adjusted exposure:

$$\hat{x}_i = \sum_j \left( \frac{w_{i,j}\, \hat{y}_j^2}{\sum_j w_{i,j}\, \hat{y}_j^2} \right) \hat{x}_{i,j}
= \frac{\sum_j w_{i,j}\, r_{i,j}\, \hat{y}_j}{\sum_j w_{i,j}\, \hat{y}_j^2}$$

$$\hat{y}_j = \sum_i \left( \frac{w_{i,j}\, \hat{x}_i^2}{\sum_i w_{i,j}\, \hat{x}_i^2} \right) \hat{y}_{j,i}
= \frac{\sum_i w_{i,j}\, r_{i,j}\, \hat{x}_i}{\sum_i w_{i,j}\, \hat{x}_i^2}$$

- This is the least-squares model by Brown (1988)

General Iteration Algorithms

- So, the key to generalization is to apply different "weighting functions" to roll up $\hat{x}_{i,j}$ to $\hat{x}_i$ and $\hat{y}_{j,i}$ to $\hat{y}_j$
- We propose a general weighting function of two factors, exposure and relativity: $W^p X^q$ and $W^p Y^q$
- Almost all minimum bias models published to date are special cases of GIA(p, q)
- There are also more modeling options to choose from, since in theory there is no limitation on the (p, q) values to try in fitting data - comprehensive and flexible

2-parameter GIA

- The 2-parameter GIA with the exposure- and relativity-adjusted weighting function is:

$$\hat{x}_i = \sum_j \left( \frac{w_{i,j}^p\, \hat{y}_j^q}{\sum_j w_{i,j}^p\, \hat{y}_j^q} \right) \hat{x}_{i,j}
= \frac{\sum_j w_{i,j}^p\, r_{i,j}\, \hat{y}_j^{q-1}}{\sum_j w_{i,j}^p\, \hat{y}_j^q}$$

$$\hat{y}_j = \sum_i \left( \frac{w_{i,j}^p\, \hat{x}_i^q}{\sum_i w_{i,j}^p\, \hat{x}_i^q} \right) \hat{y}_{j,i}
= \frac{\sum_i w_{i,j}^p\, r_{i,j}\, \hat{x}_i^{q-1}}{\sum_i w_{i,j}^p\, \hat{x}_i^q}$$

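A hypothetical NumPy sketch of this 2-parameter update, in the same style as the earlier balanced-bias sketch (names are illustrative; per the correspondence table on the next slide, p = 1, q = 1 reproduces the Bailey/Poisson model and p = 1, q = 2 the normal model):

```python
import numpy as np

def gia2_fit(r, w, p=1.0, q=1.0, n_iter=100, tol=1e-10):
    """2-parameter GIA: roll up with the weighting function W^p * Y^q."""
    m, n = r.shape
    wp = w ** p
    x, y = np.ones(m), np.ones(n)
    for _ in range(n_iter):
        x_new = (wp * r * y ** (q - 1)).sum(axis=1) / (wp * y ** q).sum(axis=1)
        xc = x_new[:, None]
        y_new = (wp * r * xc ** (q - 1)).sum(axis=0) / (wp * xc ** q).sum(axis=0)
        done = np.allclose(x_new, x, atol=tol) and np.allclose(y_new, y, atol=tol)
        x, y = x_new, y_new
        if done:
            break
    return x, y
```
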
2-parameter GIA vs. GLM

| p | q  | GLM              |
|---|----|------------------|
| 1 | -1 | Inverse Gaussian |
| 1 | 0  | Gamma            |
| 1 | 1  | Poisson          |
| 1 | 2  | Normal           |

2-parameter GIA and GLM

- GIA with p = 1 is the same as a GLM with the variance function $V(\mu) = \mu^{2-q}$
- Additional special models:
  - 0 < q < 1: the distribution is Tweedie, for pure premium models
  - 1 < q < 2: not in the exponential family
  - -1 < q < 0: the distribution is between gamma and inverse Gaussian
- After years of technical development in GLM and minimum bias, at the end of the day, all of these models are connected through the game of "weighted averages"

3-parameter GIA

- One model published to date is not covered by the 2-parameter GIA: the chi-squared model by Bailey and Simon (1960)
- Generalize further using a concept similar to the link function in GLM: f(x) and f(y)
- Estimate f(x) and f(y) through the iterative method
- Calculate x and y by inverting f(x) and f(y)

3-parameter GIA

$$f(\hat{x}_i) = \sum_j \left( \frac{w_{i,j}^p\, \hat{y}_j^q}{\sum_j w_{i,j}^p\, \hat{y}_j^q} \right) f(\hat{x}_{i,j})
= \frac{\sum_j w_{i,j}^p\, \hat{y}_j^q\, f\!\left(\frac{r_{i,j}}{\hat{y}_j}\right)}{\sum_j w_{i,j}^p\, \hat{y}_j^q}$$

$$f(\hat{y}_j) = \sum_i \left( \frac{w_{i,j}^p\, \hat{x}_i^q}{\sum_i w_{i,j}^p\, \hat{x}_i^q} \right) f(\hat{y}_{j,i})
= \frac{\sum_i w_{i,j}^p\, \hat{x}_i^q\, f\!\left(\frac{r_{i,j}}{\hat{x}_i}\right)}{\sum_i w_{i,j}^p\, \hat{x}_i^q}$$

3-parameter GIA

- We propose the 3-parameter GIA by using the power link function f(x) = x^k:

$$\hat{x}_i = \left( \frac{\sum_j w_{i,j}^p\, r_{i,j}^k\, \hat{y}_j^{q-k}}{\sum_j w_{i,j}^p\, \hat{y}_j^q} \right)^{1/k}$$

$$\hat{y}_j = \left( \frac{\sum_i w_{i,j}^p\, r_{i,j}^k\, \hat{x}_i^{q-k}}{\sum_i w_{i,j}^p\, \hat{x}_i^q} \right)^{1/k}$$

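An illustrative sketch of one x-update of this 3-parameter form (k = 1 reduces to the 2-parameter GIA; k = 2, p = 1, q = 1 gives the Bailey-Simon chi-square model shown on the next slide):

```python
import numpy as np

def gia3_update_x(r, w, y, p=1.0, q=1.0, k=1.0):
    """One x-update of the 3-parameter GIA with power link f(x) = x^k."""
    wp = w ** p
    num = (wp * r ** k * y ** (q - k)).sum(axis=1)
    den = (wp * y ** q).sum(axis=1)
    return (num / den) ** (1.0 / k)
```
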
3-parameter GIA

- When k = 2, p = 1, and q = 1:

$$\hat{x}_i = \left( \frac{\sum_j w_{i,j}\, r_{i,j}^2\, \hat{y}_j^{-1}}{\sum_j w_{i,j}\, \hat{y}_j} \right)^{1/2}, \qquad
\hat{y}_j = \left( \frac{\sum_i w_{i,j}\, r_{i,j}^2\, \hat{x}_i^{-1}}{\sum_i w_{i,j}\, \hat{x}_i} \right)^{1/2}$$

- This is the chi-square model by Bailey and Simon (1960)

Additive GIA

$$\hat{x}_i = \frac{\sum_j w_{i,j}^p\, (r_{i,j} - y_j)}{\sum_j w_{i,j}^p}, \qquad
\hat{y}_j = \frac{\sum_i w_{i,j}^p\, (r_{i,j} - x_i)}{\sum_i w_{i,j}^p}$$

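In code, the additive update is just a $W^p$-weighted average of the residuals; a minimal sketch with illustrative names:

```python
import numpy as np

def additive_gia_update_x(r, w, y, p=1.0):
    """One x-update of the additive GIA: weighted average of r[i, j] - y[j]."""
    wp = w ** p
    return (wp * (r - y)).sum(axis=1) / wp.sum(axis=1)
```
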
Mixed GIA

- For commonly used personal lines rating structures, the formula is typically a mixed multiplicative and additive model:

Price = Base * (X + Y) * Z

$$\hat{x}_{i,j,h} = \frac{r_{i,j,h}}{z_h} - y_j, \qquad
\hat{y}_{j,i,h} = \frac{r_{i,j,h}}{z_h} - x_i, \qquad
\hat{z}_{h,i,j} = \frac{r_{i,j,h}}{x_i + y_j}$$

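A sketch of the per-cell estimates for this mixed structure, assuming the response is stored as a three-way array over the levels of X, Y, and Z (illustrative names; rolling up with a chosen weighting function would follow as before):

```python
import numpy as np

def mixed_cell_estimates(r, x, y, z):
    """Per-cell factor estimates for Price = Base * (X + Y) * Z.

    r : (m, n, l) array over classes i, j and Z-level h.
    """
    x_cell = r / z[None, None, :] - y[None, :, None]    # estimate of x_i
    y_cell = r / z[None, None, :] - x[:, None, None]    # estimate of y_j
    z_cell = r / (x[:, None, None] + y[None, :, None])  # estimate of z_h
    return x_cell, y_cell, z_cell
```
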
Constraint GIA

- In the real world, the values of most pricing factors are capped due to market and regulatory constraints. For example:

$$\hat{x}_1 = \left( \frac{\sum_j w_{1,j}^p\, r_{1,j}^k\, y_j^{q-k}}{\sum_j w_{1,j}^p\, y_j^q} \right)^{1/k}$$

$$\hat{x}_2 = \max\!\left( 0.75\, \hat{x}_1,\; \min\!\left( 0.95\, \hat{x}_1,\; \left( \frac{\sum_j w_{2,j}^p\, r_{2,j}^k\, y_j^{q-k}}{\sum_j w_{2,j}^p\, y_j^q} \right)^{1/k} \right) \right)$$

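Operationally, the constraint is a clamp applied after the unconstrained update; a one-function sketch (illustrative names):

```python
import numpy as np

def apply_cap(x_candidate, x_ref, lo=0.75, hi=0.95):
    """Cap a factor between lo and hi multiples of a reference factor."""
    return np.maximum(lo * x_ref, np.minimum(hi * x_ref, x_candidate))
```
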
Numerical Methodology for GIA

- For all algorithms:
  - Use the mean of the response variable as the base
  - Starting points: 1 for multiplicative factors; 0 for additive factors
  - Use the latest relativities in the iterations
  - All the reported GIAs converge within 8 steps for our test examples
- For mixed models (see the sketch below):
  - In each step, adjust the multiplicative factors from one rating variable proportionally so that its weighted average is one
  - For the last multiplicative variable, adjust its factors so that the weighted average of the product of all multiplicative variables is one

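A minimal sketch of the rebalancing step for one multiplicative rating variable (illustrative names; the absorbed scale would be pushed into the base rate):

```python
import numpy as np

def rebalance(x, exposure):
    """Rescale factors so their exposure-weighted average is one."""
    scale = (exposure * x).sum() / exposure.sum()
    return x / scale, scale   # rescaled factors, scale absorbed by the base
```
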
Demonstration of a GIA Tool

- Written in VB.NET and runs on Windows PCs
- Approximately 200 hours for tool development
- Efficiency statistics for different test cases:

| # of Records | # of Variables | Excel Data: Loading Time | Excel Data: Model Time | CSV Data: Loading Time | CSV Data: Model Time |
|--------------|----------------|--------------------------|------------------------|------------------------|----------------------|
| 508          | 3              | 0.7 sec                  | 0.5 sec                | 0.1 sec                | 0.5 sec              |
| 25,904       | 6              | 2.8 sec                  | 6 sec                  | 1 sec                  | 5 sec                |
| 48,517       | 13             | 9 sec                    | 50 sec                 | 1.5 sec                | 50 sec               |
| 642,128      | 49             | N/A                      | N/A                    |                        | 40 sec               |

GIA Statistics: Bias Function

- GLM bias function, Mildenhall (1999):

$$Z'W(r - \mu) = 0$$

- Z: the design matrix
- h(x): the inverse of the link function
- V(x): the variance function of the GLM
- W: the diagonal matrix of weights, with k-th diagonal element $w_k\, h'(z_k \beta) / V(\mu_k)$

GIA Statistics: Bias Function

- For a multiplicative GLM:
  - The link function is log, so h(x) = exp(x)
  - The variance function is V(x) = x^c
- The bias function of the multiplicative GLM:

$$\sum_{j=1}^{n} w_{i,j}\, \mu_{i,j}^{1-c}\, (r_{i,j} - \mu_{i,j}) = 0 \quad \text{for } i = 1 \text{ to } m$$

$$\sum_{i=1}^{m} w_{i,j}\, \mu_{i,j}^{1-c}\, (r_{i,j} - \mu_{i,j}) = 0 \quad \text{for } j = 1 \text{ to } n$$

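These equations also give a direct numerical check on a fitted multiplicative model: at a converged solution with the matching variance power c, both sets of sums should be numerically zero. A minimal NumPy sketch (illustrative names; c = 1 corresponds to Poisson):

```python
import numpy as np

def glm_bias(r, w, x, y, c=1.0):
    """Evaluate the multiplicative-GLM bias functions at fitted factors."""
    mu = x[:, None] * y[None, :]
    by_i = (w * mu ** (1 - c) * (r - mu)).sum(axis=1)  # one equation per i
    by_j = (w * mu ** (1 - c) * (r - mu)).sum(axis=0)  # one equation per j
    return by_i, by_j   # both should be ~0 at the solution
```
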
GIA Statistics

- The bias function of GIA:

$$\sum_j w_{i,j}^p\, \mu_{i,j}^{q-k}\, (r_{i,j}^k - \mu_{i,j}^k) = 0 \quad \text{for } i = 1 \text{ to } m$$

$$\sum_i w_{i,j}^p\, \mu_{i,j}^{q-k}\, (r_{i,j}^k - \mu_{i,j}^k) = 0 \quad \text{for } j = 1 \text{ to } n$$

- Substituting $\mu_{i,j} = x_i y_j$ and dividing out the common factor $x_i^{q-k}$, the first equation recovers the 3-parameter GIA update:

$$\sum_j w_{i,j}^p\, \mu_{i,j}^{q-k}\, (r_{i,j}^k - \mu_{i,j}^k) = 0
\;\Rightarrow\; \sum_j w_{i,j}^p\, y_j^{q-k}\, (r_{i,j}^k - x_i^k y_j^k) = 0$$

$$\Rightarrow\; \sum_j w_{i,j}^p\, y_j^{q-k}\, r_{i,j}^k = x_i^k \sum_j w_{i,j}^p\, y_j^q
\;\Rightarrow\; \hat{x}_i = \left( \frac{\sum_j w_{i,j}^p\, r_{i,j}^k\, y_j^{q-k}}{\sum_j w_{i,j}^p\, y_j^q} \right)^{1/k}$$

GIA Statistics

- The 3-parameter GIA is equivalent to a GLM assuming the response variable $r^k$ follows a distribution with variance function $V(\mu) = \mu^{2-q/k}$
- GIA theory reveals that the underlying assumption of the chi-square model (k = 2, p = 1, q = 1) is that $r^2$ follows a Tweedie distribution with variance function $V(\mu) = \mu^{1.5}$

GIA Statistics: Residual Diagnosis

- Scaled Pearson residual of GIA:

$$e_{i,j} = \frac{r_{i,j}^k - \hat{r}_{i,j}^k}{\sqrt{Var(\mu)}}
= \frac{r_{i,j}^k - \hat{x}_i^k \hat{y}_j^k}{\left( \hat{x}_i^k \hat{y}_j^k \right)^{(2-q/k)/2}}$$

$$Var(e_{i,j}) = Var\!\left( \frac{r_{i,j}^k - \hat{x}_i^k \hat{y}_j^k}{\left( \hat{x}_i^k \hat{y}_j^k \right)^{(2-q/k)/2}} \right)
= \frac{Var(r_{i,j}^k)}{\left( \hat{x}_i^k \hat{y}_j^k \right)^{2-q/k}} = 1$$

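A sketch of the residual calculation for a fitted two-way multiplicative GIA, following the formula above (illustrative names):

```python
import numpy as np

def scaled_pearson_residuals(r, x, y, q=1.0, k=1.0):
    """Scaled Pearson residuals e[i, j] for fitted factors x, y."""
    mu_k = (x[:, None] * y[None, :]) ** k          # fitted value of r^k
    return (r ** k - mu_k) / mu_k ** ((2 - q / k) / 2)
```
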
GIA Statistics: Residual Diagnosis

[Figure: scatter plot of scaled Pearson residuals against observations for the GIA with k = 1, p = 1, and q = -0.5]

GIA Statistics: Residual Diagnosis

[Figure: scatter plot of scaled Pearson residuals by driver age (17-20 through 60+) for the GIA with k = 1, p = 1, and q = -0.5]

GIA Statistics: Residual Diagnosis

[Figure: scatter plot of scaled Pearson residuals by vehicle use (Business, DriveLong, DriveShort, Pleasure) for the GIA with k = 1, p = 1, and q = -0.5]

GIA Statistics: Residual Diagnosis

[Figure: Q-Q plot of scaled Pearson residuals against quantiles of the standard normal for the GIA with k = 1, p = 1, and q = -0.5]

GIA Statistics: Calculation Efficiency

- GLM: iteratively reweighted least-squares (IRLS) algorithm:

$$\beta = (Z'WZ)^{-1} Z'W r$$

- In each iteration, the GLM coefficients are calculated using the weight matrix from the last iteration
- GIA: fixed-point iteration
- GLM is faster, but GIA is not slow

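For comparison, one IRLS step as written on the slide, sketched in NumPy (W is the diagonal weight matrix from the previous iteration; a sketch only, not a full GLM fitter):

```python
import numpy as np

def irls_step(Z, W, r):
    """One GLM step: solve (Z'WZ) beta = Z'W r with last iteration's W."""
    return np.linalg.solve(Z.T @ W @ Z, Z.T @ W @ r)
```
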
Conclusions

- 2- and 3-parameter GIA can completely represent GLM and minimum bias models
- Can fit mixed models and models with constraints
- Provides additional model options for data fitting
- Easy to understand and does not require advanced statistical knowledge
- Can be programmed in many different tools
- Calculation efficiency is not an issue given modern computing power

Q&A