Francis Analytics
Actuarial Data Mining Services

Predictive Modeling
CAS Reinsurance Seminar
May 7, 2007
Louise Francis, FCAS, MAAA
Louise.francis@data-mines.com
Francis Analytics and Actuarial Data Mining, Inc.
www.data-mines.com

Why Predictive Modeling?
• Better use of data than traditional methods
• Advanced methods for dealing with messy data now available

Data Mining Goes Prime Time

Becoming A Popular Tool In All Industries

Real Life Insurance Application – The “Boris Gang”

Predictive Modeling Family
Predictive Modeling
• Classical Linear Models
• GLMs
• Data Mining

Data Quality: A Data Mining Problem
• Actuary reviewing a database

A Problem: Nonlinear Functions
An Insurance Nonlinear Function: Provider Bill vs.
Probability of Independent Medical Exam
[Figure: probability of IME (Value Prob IME) plotted against Provider 2 Bill]

Classical Statistics: Regression
• Estimation of parameters: fit the line that minimizes the squared deviation between actual and fitted values
  \( \min \sum_i (Y_i - \hat{Y}_i)^2 \)
[Figure: Workers Comp Severity Trend; Severity and Fitted Y plotted by Year, 1990 to 2004]

Generalized Linear Models
Common Links for GLMs
The identity link: \( h(Y) = Y \)
The log link: \( h(Y) = \ln(Y) \)
The inverse link: \( h(Y) = 1/Y \)
The logit link: \( h(Y) = \ln\left(\frac{Y}{1 - Y}\right) \)
The probit link: \( h(Y) = \Phi^{-1}(Y) \), where \( \Phi \) denotes the normal CDF

Major Kinds of Data Mining
• Supervised learning
  – Most common situation
  – A dependent variable
    • Frequency
    • Loss ratio
    • Fraud/no fraud
  – Some methods
    • Regression
    • CART
    • Some neural networks
• Unsupervised learning
  – No dependent variable
  – Group like records together
    • A group of claims with similar characteristics might be more likely to be fraudulent
    • Ex: Territory assignment, Text Mining
  – Some methods
    • Association rules
    • K-means clustering
    • Kohonen neural networks

Desirable Features of a Data Mining Method
• Any nonlinear relationship can be approximated
• A method that works when the form of the nonlinearity is unknown
• The effect of interactions can be easily determined and incorporated into the model
• The method generalizes well on out-of-sample data

The Fraud Surrogates used as Dependent Variables
• Independent Medical Exam
(IME) requested
• Special Investigation Unit (SIU) referral
  – (IME successful)
  – (SIU successful)
• Data: Detailed Auto Injury Claim Database for Massachusetts
• Accident Years (1995-1997)

Predictor Variables
• Claim file variables
  – Provider bill, Provider type
  – Injury
• Derived from claim file variables
  – Attorneys per zip code
  – Docs per zip code
• Using external data
  – Average household income
  – Households per zip

Different Kinds of Decision Trees
• Single Trees (CART, CHAID)
• Ensemble Trees, a more recent development (TREENET, RANDOM FOREST)
  – A composite or weighted average of many trees (perhaps 100 or more)

Non Tree Methods
• MARS – Multivariate Adaptive Regression Splines
• Neural Networks
• Naïve Bayes (Baseline)
• Logistic Regression (Baseline)

Classification and Regression Trees (CART)
• Tree splits are binary
• If the variable is numeric, the split is based on R², or on the sum or mean of squared errors
  – For any variable, choose the two-way split of the data that reduces the MSE the most
  – Do this for all independent variables
  – Choose the variable that reduces the squared errors the most
• When the dependent variable is categorical, other goodness-of-fit measures (Gini index, deviance) are used

CART – Example of 1st split on Provider 2 Bill, With Paid as Dependent
1st Split
All Data: Mean = 11,224
  Bill < 5,021: Mean = 10,770
  Bill >= 5,021: Mean = 59,250
• For the entire database, total squared deviation of paid losses around the predicted value (i.e., the mean) is \(4.95 \times 10^{13}\).
The SSE declines to \(4.66 \times 10^{13}\) after the data are partitioned using $5,021 as the cutpoint.
• Any other partition of the provider bill produces a larger SSE than \(4.66 \times 10^{13}\). For instance, if a cutpoint of $10,000 is selected, the SSE is \(4.76 \times 10^{13}\).

Continue Splitting to get more homogeneous groups at terminal nodes
[Figure: regression tree with successive splits on mp2.bill (at 544.5, 3.5, 4055.5, 1443.5, 16659 and 5151.5); terminal node values 0.02254, 0.04817, 0.07767, 0.08832, 0.06980, 0.11480, 0.13330]

Ensemble Trees: Fit More Than One Tree
• Fit a series of trees
• Each tree added improves the fit of the model
• Average or sum the results of the fits
• There are many methods to fit the trees and prevent overfitting
• Boosting: Iminer Ensemble and Treenet
• Bagging: Random Forest

Treenet Prediction of IME Requested
[Figure: Treenet predicted probability of IME (Value Prob IME) plotted against Provider 2 Bill]

Neural Networks
Three Layer Neural Network
• Input Layer (Input Data)
• Hidden Layer (Process Data)
• Output Layer (Predicted Value)

Neural Networks
• Also minimizes squared deviation between fitted and actual values
• Can be viewed as a non-parametric, non-linear regression

Hidden Layer of Neural Network (Input Transfer Function)
[Figure: Logistic Function for Various Values of w1 (w1 = -10, -5, -1, 1, 5, 10), plotted against X]

The Activation Function
(Transfer Function)
• The sigmoid logistic function
  \( f(Y) = \frac{1}{1 + e^{-Y}} \)
  \( Y = w_0 + w_1 X_1 + w_2 X_2 + \ldots + w_n X_n \)

Neural Network: Provider 2 Bill vs. IME Requested
[Figure: Fitted Neural Net Prediction plotted against Provider 2 Bill]

MARS: Provider 2 Bill vs. IME Requested
[Figure: MARS Predicted IME plotted against Provider 2 Bill]

How MARS Fits Nonlinear Function
• MARS fits a piecewise regression
  – BF1 = max(0, X - 1,401.00)
  – BF2 = max(0, 1,401.00 - X)
  – BF3 = max(0, X - 70.00)
  – Y = 0.336 + .145626E-03 * BF1 - .199072E-03 * BF2 - .145947E-03 * BF3
  – BF1, BF2, BF3 are basis functions
• MARS uses statistical optimization to find the best basis function(s)
• A basis function is similar to a dummy variable in regression: like a combination of a dummy indicator and a linear independent variable

Baseline Method: Naïve Bayes Classifier
• Naïve Bayes assumes that the features (predictor variables) are independent conditional on each category
• The probability that an observation X will have a specific set of values for the independent variables is the product of the conditional probabilities of observing each of the values given target category \(c_j\), \(j = 1\) to \(m\) (\(m\) typically 2)
  \( P(X_1, X_2, \ldots, X_n \mid c_j) = \prod_i P(X_i = x_i \mid c_j) \)
  where \(x_1, x_2, \ldots, x_n\) are specific values for the independent variables

Naïve Bayes Formula
\( P(C_j \mid X_1, X_2, \ldots, X_n) = \frac{P(C = c_j, X_1, X_2, \ldots, X_n)}{P(X_1, X_2, \ldots, X_n)} \) (Bayes Rule)
\( P(C_j \mid X_1, X_2, \ldots, X_n) = \frac{P(C = c_j) \prod_i P(X_i \mid c_j)}{P(X_1, X_2, \ldots, X_n)} \)
The denominator \( P(X_1, X_2, \ldots, X_n) \) is a constant.

Advantages/Disadvantages
• Computationally efficient
• Under many circumstances has performed well
• Assumption of conditional independence often does not hold
• Can’t be used for numeric variables

Naïve Bayes Predicted IME vs. Provider 2 Bill
[Figure: Mean Probability IME plotted against Provider 2 Bill]

True/False Positives and True/False Negatives (Type I and Type II Errors)
The “Confusion” Matrix
• Choose a “cut point” in the model score.
• Claims with a score above the cut point are classified “yes”.
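
The cut-point rule above can be sketched in a few lines of Python. This is only an illustration: the function name `confusion_matrix`, the scores and the true classes are made up for the example and are not taken from the Massachusetts claim data. A claim is classified “yes” when its score is strictly above the cut point, and each claim is then tallied by (predicted class, true class).

```python
def confusion_matrix(scores, actual, cut_point):
    """Tally claims by (predicted class, true class) at a single cut point.

    scores    : model scores, one per claim (hypothetical values)
    actual    : True if the claim is truly a "yes", False if it is a "no"
    cut_point : claims with score > cut_point are classified "yes"
    """
    counts = {("yes", "yes"): 0, ("yes", "no"): 0,
              ("no", "yes"): 0, ("no", "no"): 0}
    for score, is_yes in zip(scores, actual):
        predicted = "yes" if score > cut_point else "no"
        true_class = "yes" if is_yes else "no"
        counts[(predicted, true_class)] += 1
    return counts


# Illustrative data only: six claims with made-up scores and true classes.
scores = [0.05, 0.08, 0.11, 0.12, 0.20, 0.30]
actual = [False, False, True, False, True, True]

matrix = confusion_matrix(scores, actual, cut_point=0.10)
for (predicted, true_class), n in matrix.items():
    print(f"predicted {predicted:>3} / true {true_class:>3}: {n}")
```

Raising or lowering `cut_point` moves claims between the predicted “yes” and “no” rows, which is why the error rates depend on the cut point chosen.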
Sample Confusion Matrix: Sensitivity and Specificity

                  True Class
Prediction        No       Yes      Row Total
No                800      200      1,000
Yes               200      400      600
Column Total      1,000    600

                  Correct  Total    Percent Correct
Sensitivity       800      1,000    80%
Specificity       400      600      67%

ROC Curves and Area Under the ROC Curve
• Want good performance both on sensitivity and specificity
• Sensitivity and specificity depend on cut points chosen
  – Choose a series of different cut points, and compute sensitivity and specificity for each of them
  – Graph results
    • Plot sensitivity vs. 1 - specificity
    • Compute an overall measure of “lift”, or area under the curve

TREENET ROC Curve – IME
Explain AUROC
[Figure: Treenet ROC curve for IME]
AUROC = 0.701

Ranking of Methods/Software – IME Requested

Method/Software          AUROC    Lower Bound   Upper Bound
Random Forest            0.7030   0.6954        0.7107
Treenet                  0.7010   0.6935        0.7085
MARS                     0.6974   0.6897        0.7051
S-PLUS Neural            0.6961   0.6885        0.7038
S-PLUS Tree              0.6881   0.6802        0.6961
Logistic                 0.6771   0.6695        0.6848
Naïve Bayes              0.6763   0.6685        0.6841
SPSS Exhaustive CHAID    0.6730   0.6660        0.6820
CART Tree                0.6694   0.6613        0.6775
Iminer Neural            0.6681   0.6604        0.6759
Iminer Ensemble          0.6491   0.6408        0.6573
Iminer Tree              0.6286   0.6199        0.6372

Some Software Packages That Can be Used
• Excel
• Access
• Free Software
  – R
  – Web based software
• S-Plus (similar to commercial version of R)
• SPSS
• CART/MARS
• Data Mining suites – (SAS Enterprise Miner/SPSS Clementine)

References
• Derrig, R., Francis, L., “Distinguishing the Forest from the Trees: A Comparison of Tree Based Data Mining Methods”, CAS Winter Forum, March 2006, www.casact.org
• Derrig, R., Francis, L., “A
Comparison of Methods for Predicting Fraud”, Risk Theory Seminar, April 2006
• Francis, L., “Taming Text: An Introduction to Text Mining”, CAS Winter Forum, March 2006, www.casact.org
• Francis, L.A., “Neural Networks Demystified”, Casualty Actuarial Society Forum, Winter, pp. 254-319, 2001
• Francis, L.A., “Martian Chronicles: Is MARS better than Neural Networks?”, Casualty Actuarial Society Forum, Winter, pp. 253-320, 2003b
• Dhar, V., Seven Methods for Transforming Corporate Data into Business Intelligence, Prentice Hall, 1997
• The web site www.data-mines.com has some tutorials and presentations

Predictive Modeling
CAS Reinsurance Seminar
May, 2006
Louise.francis@data-mines.com
www.data-mines.com
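
Supplementary sketch for the slide “How MARS Fits Nonlinear Function”: the fitted equation on that slide can be evaluated directly. The coefficients and knots (70 and 1,401) are copied from the slide; the function name `mars_predict` and the sample bill amounts are hypothetical and chosen only for illustration.

```python
def mars_predict(x):
    """Evaluate the fitted MARS equation from the slide at Provider 2 Bill = x."""
    bf1 = max(0.0, x - 1401.00)   # nonzero only above the knot at 1,401
    bf2 = max(0.0, 1401.00 - x)   # nonzero only below the knot at 1,401
    bf3 = max(0.0, x - 70.00)     # nonzero only above the knot at 70
    return 0.336 + 0.145626e-3 * bf1 - 0.199072e-3 * bf2 - 0.145947e-3 * bf3


# Sample bill amounts (illustrative, not from the claim database).
for bill in (0, 70, 1401, 4000):
    print(f"Provider 2 Bill = {bill:>5}: predicted IME = {mars_predict(bill):.4f}")
```

Each basis function switches a linear term on or off at its knot, so the fitted curve is piecewise linear. From the slide's coefficients, the slope is .199072E-03 below 70, drops to .199072E-03 - .145947E-03 = .053125E-03 between 70 and 1,401, and is nearly flat (.145626E-03 - .145947E-03) above 1,401.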