Credit Rating Analysis with Support Vector Machines and Neural Networks: A Market Comparative Study
Zan Huang, Hsinchun Chen, Chia-jung Hsu, Andy Chen, Soushan Wu
AI Seminar, Artificial Intelligence Lab, The University of Arizona
08/16/2002

Agenda
• Introduction
• Credit Risk Analysis
• Literature Review
• Research Questions
• Analytical Methods
• Data Sets
• Experiment Results and Analysis
• Discussion and Future Directions

Introduction

Credit Rating
• Credit rating is valuable information
– Widely used measure of the riskiness of companies and bonds
• Credit rating is expensive information
– Costly to obtain
• Credit rating prediction is important
– For investors: estimate the riskiness of unrated companies
– For companies: monitor their own credit rating and predict future ratings

Credit Rating Prediction
• Rating agencies: subjective judgment is important; ratings are not predictable.
• Researchers: satisfactory results have been obtained using statistical and AI methods.
• Prediction assumption
– Risk evaluation expertise is embedded in historical rating data
• Beyond prediction
– Interpretation of models
– Market characteristics

Our Study
• Apply a relatively new machine learning technique, Support Vector Machines, alongside a classic technique, Neural Networks
• Interpretation of the model
– Variable contribution analysis
• Cross market analysis
– United States and Taiwan markets

Credit Risk Analysis

Credit Rating
• Two types of ratings
– Debt issue rating: bond rating, issue credit rating
– Debt issuer rating: counterparty credit rating, default rating, issuer credit rating
• Significant implications for the investment community
– Interest yield of the debt issue
– Investment regulation (“investment” level ratings)
– Conveys information about the value of the firm

Credit Rating Process
• Typical process
– Issuing company contacts a rating agency to request a rating
– Issuing company submits an evaluation package
– Rating agency forms an evaluation team
– Evaluation team submits a rating report
– Rating committee makes the final decision
• Time and labor intensive
• Emphasizes the subjective judgment of financial analysts and rating committee members

Literature Review: Bond Rating Prediction

Statistical Methods
• Ordinary Least Squares (OLS)
– Fisher 1959, Horrigan 1966, Pogue 1969, West 1970
• Multiple Discriminant Analysis (MDA)
– Pinches and Mingo 1973, 1975
• Logistic Regression Analysis
– Ederington 1985
• Probit Analysis
– Gentry 1988, Jackson
• Prediction accuracy: 50–70%
• Frequently used financial variables
– Measures of size, financial leverage, long-term capital intensiveness, return on investment, short-term capital intensiveness, earnings stability and debt coverage stability

Statistical Methods (cont.)
• General conclusion
– A simple model with a small list of financial variables could classify about two-thirds of a holdout sample of bonds
• Statistical models
– Succinct and easy to explain
– Problem: violation of multivariate normality assumptions for independent variables

Artificial Intelligence Methods
• Trade-off between explanatory power and interpretability of the models
• Statistical methods
– Simple models that under-fit the data
• Artificial Intelligence methods
– Increased model size (complexity of the models)
– Higher prediction accuracy (possible data overfitting)
– Difficult to interpret

Artificial Intelligence Methods (cont.)
• Neural networks
• Rule-based systems
• Inductive learning/decision trees
• Case-based reasoning systems

Artificial Intelligence Methods (cont.)
Study | Bond rating categories | Method | Accuracy | Data | Sample size | Benchmark statistical methods
Dutta and Shekhar 1988 | 2 (AA vs. non-AA) | BP | 83.30% | US | 30/17 | LinR (64.7%)
Singleton and Surkan 1990 | 2 (Aaa vs. A1, A2 or A3) | BP | 88% | US (Bell companies) | 126 | MDA (39%)
Garavaglia 1991 | 3 | BP | 84.90% | US S&P | 797 | N/A
Kim 1993 | 6 | BP, RBS | 55.17% (BP), 31.03% (RBS) | US S&P | 110/58/60 | LinR (36.21%), MDA (36.20%), LogR (43.10%)
Moody and Utans 1995 | 16 | BP | 36.2%, 63.8% (5 classes), 85.2% (3 classes) | US S&P | N/A | N/A

Artificial Intelligence Methods (cont.)

Study | Bond rating categories | Method | Accuracy | Data | Sample size | Benchmark statistical methods
Maher and Sen 1997 | 6 | BP | 70% (7), 66.67% (5) | US Moody's | 299 | LogR (61.66%), MDA (58-61%)
Kwon et al. 1997 | 5 | BP (with OPP) | 71-73% (with OPP), 66-67% (without OPP) | Korean | 126 | MDA (58-62%)
Kwon and Lim 1998 | 5 | ACLS, BP | 59.9% (ACLS), 72.5% (BP) | Korean | 60/126 | MDA (61.6%)
Chaveesuk et al. 1999 | 6 | BP, RBF, LVQ | 56.7% (BP), 38.3% (RBF), 36.7% (LVQ) | US S&P | 60 (10 for each category) | LogR (53.3%)
Shin and Han 2001 | 5 | CBR, GA | 75.5% (CBR, GA combined), 62.0% (CBR), 53-54% (ID3) | Korean | 3886 | MDA (58.4-61.6%)

BP: Backpropagation Neural Networks, RBS: Rule-based System, ACLS: Analog Concept Learning System, RBF: Radial Basis Function, LVQ: Learning Vector Quantization, CBR: Case-based Reasoning, GA: Genetic Algorithm, MDA: Multiple Discriminant Analysis, LinR: Linear Regression, LogR: Logistic Regression, OPP: Ordinary Pairwise Partitioning. Sample size: Training/tuning/testing.

Artificial Intelligence Methods (cont.)
• General conclusion
– Neural networks have been the most frequently used method.
– Neural networks outperformed conventional statistical methods and inductive learning methods.
• Accuracy comparisons across previous studies need to be adjusted for the number of prediction classes
– 5-class prediction accuracy: 55–75%
• Wide range of financial variables and sample sizes
– Number of financial variables: 7–87
– Sample sizes: 47–3886
• United States market and Korean market

Research Questions

Research Questions
• Explanatory power
– Will applying a relatively new machine learning technique, Support Vector Machines, improve credit rating prediction accuracy?
• Interpretability
– Can we provide analysis that increases the interpretability of Artificial Intelligence methods and extracts more information about market characteristics from Artificial Intelligence models?
– Can we use Artificial Intelligence models to compare the characteristics of different financial markets?

Analytical Methods

Backpropagation Neural Network
• Most frequently used and best-performing method in the literature
• Different network architectures have been tried
– Number of hidden layers, number of hidden nodes
• Used a standard three-layer fully connected backpropagation neural network
– Number of hidden nodes: (number of input nodes + number of output nodes)/2

Support Vector Machines
• Introduced by Vapnik in 1995
• Based on the Structural Risk Minimization principle from computational learning theory
• SVM is positioned at the intersection of learning theory and practice
– “it contains a large class of neural nets, radial basis function (RBF) nets, and polynomial classifiers as special cases. Yet it is simple enough to be analyzed mathematically, because it can be shown to correspond to a linear method in a high-dimensional feature space nonlinearly related to input space.” – Hearst 1998

Support Vector Machines (cont.)
• A good candidate for combining the strengths of more theory-driven statistical methods and more data-driven machine learning methods
• Empirical evidence
– Excellent generalization performance in a wide range of problems (bioinformatics, text categorization, image detection, etc.)
• Has not been applied to the credit rating prediction problem
• Multi-class SVM
– Hsu and Lin 2002, BSVM package

Data Sets

Taiwan Data Set
• Taiwan Ratings Corporation
– Established in 1997, partnering with Standard & Poor’s
• Securities and Futures Institute
– Quarterly financial statements and financial ratios of publicly traded companies
• Data preparation
– Used the credit rating and the company’s financial variables from 2 quarters before the rating release date
– 74 data points, 21 financial variables, 25 financial institutions, 1998-2002

United States Data Set
• A comparable US data set from Standard & Poor’s Compustat
– Comparable financial variables
– S&P senior debt rating for all commercial banks (DNUM 6021)
– 36 commercial banks, 265 data points, 1991-2000

TW data | twAAA | twAA | twA | twBBB | twBB | Total
Count | 8 | 11 | 31 | 23 | 1 | 74

US data | AA | A | BBB | BB | B | Total
Count | 20 | 181 | 56 | 7 | 1 | 265

Variable Selection
• ANOVA test
– Whether the differences in each financial variable among the rating classes were significant
– 5 uninformative variables removed from the data set
• Final data sets
– Taiwan: 14 financial ratios and 2 balance measures
– United States: 12 financial ratios and 2 balance measures

Financial Variables

Variable | Financial Ratio Name/ Description | ANOVA Between-Group P-Value
X1 | Total assets | 0
X2 | Total liabilities | 0
X3 | Long-term debts/ total invested capital | 0.12
X4 | Debt ratio | 0
X5 | Current ratio | 0.36
X6 | Times interest earned (EBIT/interest) | 0
X7 | Operating profit margin | 0
X8 | (Shareholders’ equity + long-term debt)/ fixed assets | 0
X9 | Quick ratio | 0.37
X10 | Return on total assets | 0.01
X11 | Return on equity | 0.04
X12 | Operating income/ received capitals | 0
X13 | Net income before tax/ received capitals | 0
X14 | Net profit margin | 0
X15 | Earnings per share | 0
X16 | Gross profit margin | 0.02
X17 | Non-operating income/ sales | 0.81
X18 | Net income before tax/ sales | 0
X19 | Cash flow from operating activities/ current liabilities | 0.84
X20 | (Cash flow from operating activities/ (capital expenditures + increase in inventory + cash dividends)) in last 5 years | 0.64
X21 | (Cash flow from operating activities – cash dividends)/ (fixed assets + other assets + working capital) | 0.08

Experiment Results and Analysis

Experiment Results
• 4 models (frequently used variables, full set of variables)
– TW I: Rating = f(X1, X2, X3, X4, X6, X7)
– TW II: Rating = f(X1, X2, X3, X4, X6, X7, X8, X10, X11, X12, X13, X14, X15, X16, X18, X21)
– US I: Rating = f(X1, X2, X3, X6, X7)
– US II: Rating = f(X1, X2, X3, X6, X7, X8, X10, X11, X12, X13, X14, X15, X16, X21)

Experiment Results (cont.)
• Results
– SVM did not outperform neural networks.
– The small set of frequently used financial variables contained most of the relevant information.
 | TW I | TW II | US I | US II
SVM Results | 79.73% | 77.03% | 78.87% | 80.00%
NN Results | 75.68% | 75.68% | 80.00% | 79.25%
Difference | 4.05% | 1.35% | -1.13% | 0.75%

Experiment Results
[Figure: prediction accuracy of SVM and NN for models TW I, TW II, US I and US II; vertical axis from 73.00% to 81.00%]

Within-1-class accuracy

TW I (rows: actual rating; columns: predicted rating)
 | twAAA | twAA | twA | twBBB | twBB
twAAA | 7 | 0 | 1 | 0 | 0
twAA | 0 | 10 | 1 | 0 | 0
twA | 4 | 1 | 23 | 3 | 0
twBBB | 1 | 0 | 6 | 16 | 0
twBB | 0 | 0 | 0 | 1 | 0
TW I: within-1-class accuracy: 91.89%

TW II (rows: actual rating; columns: predicted rating)
 | twAAA | twAA | twA | twBBB | twBB
twAAA | 5 | 0 | 2 | 1 | 0
twAA | 0 | 9 | 2 | 0 | 0
twA | 2 | 4 | 22 | 2 | 0
twBBB | 0 | 0 | 5 | 17 | 1
twBB | 0 | 0 | 0 | 1 | 0
TW II: within-1-class accuracy: 93.24%

Predicted Rating Actual Rating Predicted Rating AA A BBB BB B Actual Rating AA 0 20 0 0 0 A 0 178 3 0 BBB 0 23 33 BB 0 2 B 0 0 AA A BBB BB B AA 6 13 1 0 0 0 A 2 165 12 2 0 0 0 BBB 0 16 37 2 1 5 0 0 BB 0 0 0 2 3 1 0 0 B 0 0 0 4 1
US I: within-1-class accuracy: 97.74%
US II: within-1-class accuracy: 98.44%

Variable Contribution Analysis
• Research on credit rating prediction using Artificial Intelligence methods has focused solely on prediction accuracy.
• Low level understanding of the market
– Credit rating analysts rate companies (consciously or unconsciously) based on a specific set of financial variables
• Higher level understanding
– What is the relative importance of individual financial variables in the process of credit rating?
– Variable Contribution Analysis

Variable Contribution Analysis (cont.)
• Difficult for both Neural Networks and Support Vector Machines
• Substantial literature on interpreting neural network models
– Mainly extracts information from the connection strengths (inter-layer weights) of the neural network model
– Measures of relative importance: Garson 1991, Yoon 1994
– Symbolic rules derived from connection weights: Taha 1999
– Optimal neural network structure construction and better understanding of the models: Engelbrecht 1998

Measure of Relative Importance
• First-order derivatives of the network parameters
– Neural network model: $\langle y_1, y_2, \ldots, y_n \rangle = f(\langle x_1, x_2, \ldots, x_m \rangle)$
– Contribution measure: $\partial y_i / \partial x_j$
• $Con_{ik}$: relative contribution of input $i$ on output $k$. Connection strengths between the input, hidden and output layers are denoted $w_{ji}$ and $v_{jk}$.
• Garson 1991
– Without direction
\[
Con_{ik} = \frac{\sum_{j=1}^{J} \dfrac{|w_{ji}|\,|v_{jk}|}{\sum_{i'=1}^{I} |w_{ji'}|}}{\sum_{i=1}^{I} \sum_{j=1}^{J} \dfrac{|w_{ji}|\,|v_{jk}|}{\sum_{i'=1}^{I} |w_{ji'}|}}
\]
• Yoon 1994
– With direction
\[
Con_{ik} = \frac{\sum_{j=1}^{J} w_{ji}\, v_{jk}}{\sum_{i=1}^{I} \left| \sum_{j=1}^{J} w_{ji}\, v_{jk} \right|}
\]

Variable Contribution Analysis
• Garson’s measure
• Optimal set of variables for the two markets
– TW III: Rating = f(X1, X2, X3, X4, X6, X7, X8)
– US III: Rating = f(X1, X2, X3, X4, X7, X11)

Variable | Financial Variable Name/ Description
X1 | Total assets
X2 | Total liabilities
X3 | Long-term debts/ total invested capital
X4 | Debt ratio
X6 | Times interest earned (EBIT/interest)
X7 | Operating profit margin
X8 | (Shareholders’ equity + long-term debt)/ fixed assets
X11 | Return on equity

Contribution Analysis Results
[Figure: two bar charts of the contribution measure per financial variable. Variable Contribution (United States): variables X1, X2, X3, X4, X7, X11; series AA, A, BBB, BB, B. Variable Contribution (Taiwan): variables X1, X2, X3, X4, X6, X7, X8; series twAAA, twAA, twA, twBBB, twBB.]

Cross Market Analysis
• US Model
– X1, X2, X3, X7 | X4, X11
– Most important: total assets, total liabilities, long-term debts/total invested capital
• TW Model
– X4, X7, X8 | X1, X2, X3, X6
– Most important: operating profit margin, debt ratio

Discussion and Future Directions

Discussion
• We need expertise from the credit rating industry to evaluate and interpret the results
– Some positive response: “Size is not (that) important in Taiwan.” – Dr. Soushan Wu
• The reason for the prediction accuracy improvement over previous studies
• The reason for SVM’s failure to improve

Future Directions
• Data mining + text mining
– Add important financial variables from text-format annual reports
• Larger scale cross market analysis
– Mainland China, Taiwan, Hong Kong and United States markets
• Multidimensional financial data visualization and exploration
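
Worked example: relative importance measures
The two measures on the "Measure of Relative Importance" slide can be sketched in plain Python as follows. This is a minimal illustration, not the code used in the study: the function names and the tiny example weights are hypothetical, w[j][i] is taken to be the weight from input i to hidden node j, and v[j][k] the weight from hidden node j to output k. Yoon's denominator is written here with an absolute value around the inner sum, which is an assumption about the slide's formula.

```python
from typing import List

Matrix = List[List[float]]


def garson(w: Matrix, v: Matrix) -> Matrix:
    """Garson-style contribution con[i][k] of input i to output k (no sign).

    Assumes every hidden node has at least one non-zero input weight.
    For each output k, the contributions over all inputs sum to 1.
    """
    n_hidden, n_in, n_out = len(w), len(w[0]), len(v[0])
    # Sum of absolute input weights per hidden node (inner normaliser).
    row_abs = [sum(abs(x) for x in w[j]) for j in range(n_hidden)]
    con = [[0.0] * n_out for _ in range(n_in)]
    for k in range(n_out):
        raw = [
            sum(abs(w[j][i]) * abs(v[j][k]) / row_abs[j] for j in range(n_hidden))
            for i in range(n_in)
        ]
        total = sum(raw)
        for i in range(n_in):
            con[i][k] = raw[i] / total
    return con


def yoon(w: Matrix, v: Matrix) -> Matrix:
    """Yoon-style contribution con[i][k]; the sign gives the direction.

    Assumes the signed weight products do not all cancel to zero.
    """
    n_hidden, n_in, n_out = len(w), len(w[0]), len(v[0])
    con = [[0.0] * n_out for _ in range(n_in)]
    for k in range(n_out):
        raw = [
            sum(w[j][i] * v[j][k] for j in range(n_hidden)) for i in range(n_in)
        ]
        total = sum(abs(r) for r in raw)
        for i in range(n_in):
            con[i][k] = raw[i] / total
    return con


# Hypothetical weights: 2 inputs, 2 hidden nodes, 1 output.
w_example = [[1.0, -3.0], [2.0, 2.0]]
v_example = [[0.5], [-1.0]]

garson_example = garson(w_example, v_example)
yoon_example = yoon(w_example, v_example)
print("Garson:", garson_example)
print("Yoon:  ", yoon_example)
```

Garson's measure discards the sign of each weight, so it ranks variables by strength only; Yoon's measure keeps the sign and therefore also indicates whether an input pushes an output up or down.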
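
Worked example: within-1-class accuracy
The within-1-class figure reported for TW I can be reproduced from its confusion matrix. The sketch below assumes rows are actual ratings and columns are predicted ratings, both ordered twAAA, twAA, twA, twBBB, twBB; the row totals (8, 11, 31, 23, 1) agree with the Taiwan rating distribution, which supports that reading. The function names are illustrative.

```python
from typing import List


def exact_accuracy(cm: List[List[int]]) -> float:
    """Share of cases on the diagonal (predicted class == actual class)."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def within_one_accuracy(cm: List[List[int]]) -> float:
    """Share of cases predicted at most one rating class away from actual."""
    total = sum(sum(row) for row in cm)
    near = sum(
        cm[i][j]
        for i in range(len(cm))
        for j in range(len(cm[i]))
        if abs(i - j) <= 1
    )
    return near / total


# TW I confusion matrix as shown on the slide (assumed rows = actual).
TW_I = [
    [7, 0, 1, 0, 0],
    [0, 10, 1, 0, 0],
    [4, 1, 23, 3, 0],
    [1, 0, 6, 16, 0],
    [0, 0, 0, 1, 0],
]

tw1_exact = exact_accuracy(TW_I)
tw1_within_one = within_one_accuracy(TW_I)
print(f"exact: {tw1_exact:.2%}, within-1-class: {tw1_within_one:.2%}")
```

For this matrix, 56 of 74 cases lie on the diagonal (75.68%) and 68 of 74 lie within one class of the actual rating (91.89%), which matches the TW I within-1-class accuracy on the slide.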
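
Worked example: screening variables by ANOVA p-value
The variable selection step can be illustrated with the between-group p-values from the Financial Variables slide. The slides do not state the cutoff that was used, so the value 0.3 below is purely an assumption chosen for illustration; with it, the screen drops five variables and keeps the 16 variables that appear in model TW II.

```python
# Between-group ANOVA p-values copied from the Financial Variables slide.
P_VALUES = {
    "X1": 0.0, "X2": 0.0, "X3": 0.12, "X4": 0.0, "X5": 0.36,
    "X6": 0.0, "X7": 0.0, "X8": 0.0, "X9": 0.37, "X10": 0.01,
    "X11": 0.04, "X12": 0.0, "X13": 0.0, "X14": 0.0, "X15": 0.0,
    "X16": 0.02, "X17": 0.81, "X18": 0.0, "X19": 0.84, "X20": 0.64,
    "X21": 0.08,
}

# Hypothetical cutoff: NOT given in the slides, chosen only for illustration.
CUTOFF = 0.3

# Variables whose class means do not differ clearly are treated as uninformative.
removed = [name for name, p in P_VALUES.items() if p > CUTOFF]
kept = [name for name, p in P_VALUES.items() if p <= CUTOFF]

print("removed:", removed)
print("kept:   ", kept)
```

Under this assumed cutoff, X3 (0.12) and X21 (0.08) are retained even though they would fail a conventional 0.05 test, which is consistent with both variables appearing in the TW II and US II models.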