Data Organization and Analysis in Mortgage Insurance: The Implications of Dynamic Risk Characteristics

Prepared for: CAS 2008 Seminar on Ratemaking, March 17-18, 2008
Presented by: Tanya Havlicek; Kyle Mrotek, FCAS

Background: What is Mortgage Insurance (MI)?

The Mortgage Guaranty Model Act of the NAIC defines mortgage insurance as insurance against financial loss by reason of nonpayment of principal, interest, or other sums agreed to be paid under the terms of any note or bond or other evidence of indebtedness secured by a mortgage, deed of trust, or other instrument constituting a lien or charge on real estate… (“Mortgage Guaranty Insurance Model Act” Model #630-1, NAIC Section 2)

Background: What is Mortgage Insurance?

[Diagram of the parties to an insured mortgage. Labels as shown on the slide: The Borrower; Applies for Loan; The Lender; Selects MI; The MI; Makes the Loan Payment including MI fee; The Servicer; Forwards Interest Yield and MI Claim Checks; Fannie / Freddie Investors]

Reserving Considerations

Generally accepted methodology of reserving for mortgage guaranty claim liabilities is to reserve for loans currently delinquent, both known and IBNR:
  • Occurrence is at missed payment
Mortgage guaranty insurance companies do not reserve for loans insured but not delinquent (i.e.
loans current on payments do not have associated reserves) (“Mortgage Guaranty Insurance Model Act” Model #630-1, NAIC Section 16)
The cohort of insured loans currently delinquent changes monthly
Many delinquent loans do not result in a claim
Delinquency status (categorical classification of a mortgage’s overdue payment) is an established strong predictor of future losses
The relationship between delinquency status and future losses can be stable or change over time
  • changes in economic conditions (unemployment, home price appreciation)
  • changes in mortgage products (distribution of ARMs or high Loan-To-Value loans)
  • changes in foreclosure or claims mitigation procedures

Reserving Considerations

A frequency-severity approach is commonly used to estimate required reserves in mortgage insurance
In general, the further a loan is in default, the less likely it is to cure (become current on payments) and the more likely it is to lead to a claim
Delinquency status can be based on the number of monthly payments missed or how long the loan has been consecutively delinquent
  • e.g. a loan one payment behind for 6 months (delinquent 1 month vs 6 months since missed payment)
Other considerations for frequency factors include underwriting risk characteristics and macroeconomic variables:
  • Loan-To-Value
  • Borrower credit rating
  • Property geography
  • Home price appreciation
  • Market interest rates
The severity component is often conditioned on a claim having occurred, so the dynamic input variable delinquency status has no effect on it
  • this presentation focuses on the frequency component of losses

Reserving Considerations

Goal: Develop frequency factors that predict the claim rate for the current cohort of delinquent loans in order to estimate required reserves
  • Select factors along specified risk dimensions (including delinquency status)
The dynamic nature of loan delinquency status manifests itself in MI reserving in several aspects:
  – Determines the cohort of loans that currently need reserves
  – A loan’s
active delinquency status is a strong predictor of future losses
  – Need to calculate historical conditional claim frequencies to derive selected reserving factors
Establishing loss reserves conditioned on delinquency status presents particular data issues
There is a need to collect, organize, warehouse, and analyze large data sets that contain loan-level detail over monthly evaluation dates to measure the probability of claim conditioned on delinquency status

Delinquency Status

Once a loan becomes delinquent, over time it can maintain the same status, become progressively more delinquent, or move back and forth between delinquency stages before eventually resolving into one of two fates: it may become current in payments and be considered cured, or it may remain in default and result in a claim
  • To derive historical indications, need to track each delinquency over consecutive monthly evaluations to its ultimate cure or claim
Loans can cure but then become delinquent at a later date
  • Must be able to distinguish delinquency trips to calculate pure claim rates
Distinguishing and quantifying delinquency trips and their subsequent fates for all delinquent loans at every historical month, and then aggregating along risk-characteristic dimensions to develop reserving factors, requires data availability and storage over consecutive monthly evaluation dates; otherwise tracking capability is lost to data uncertainties

Delinquency Status

Consider the following example to better understand the data organization and analysis issues created by delinquency status

Illustrative Example of One Loan's Status Over Time

  Eval Month  Status          Need Reserve?  Delinquency Trip #
   1          Current         No             NA
   2          Delinquent-30   Yes            1
   3          Delinquent-60   Yes            1
   4          Current         No             NA
   5          Delinquent-30   Yes            2
   6          Current         No             NA
   7          Current         No             NA
   8          Delinquent-30   Yes            3
   9          Delinquent-60   Yes            3
  10          Delinquent-90   Yes            3
  11          Delinquent-60   Yes            3
  12          Delinquent-90   Yes            3
  13          Delinquent-60   Yes            3
  14          Delinquent-30   Yes            3
  15          Current         No             NA
  16          Delinquent-30   Yes            4
  17          Delinquent-60   Yes            4
  18          Delinquent-90   Yes            4
  19          Delinquent-120  Yes            4
  20          Current         No             NA
  21          Delinquent-30   Yes            5
  22          Delinquent-60   Yes            5
  23          Delinquent-90   Yes            5
  24          Delinquent-120  Yes            5
  25          Claim           NA             5

This loan’s delinquency status changes nearly every month
As delinquency status changes, the expected loss changes via frequency (or probability of claim)
The expected frequency of claim for a “30” may be different in Eval Month 2 and Eval Month 16
  • e.g. change in economic conditions
Only on Trip 5 does this loan result in a claim (overall claim rate = 1/5)
Developing historical claim frequencies requires tracking each loan at each month to its ultimate resolution for that delinquency trip

Data Organization

Two types of loan characteristics an actuary may want to use as dimensions in developing reserving factors: dynamic and static
Static characteristics: do not change over the lifetime of the loan
  • e.g. original loan-to-value, borrower original credit score
  – Static information can be stored in a database with one record for each policy
  – Updated and appended to as new policies are originated
Dynamic characteristics: can change monthly
  • e.g.
delinquency status, current loan-to-value
  – Dynamic information should be stored in a database with a policy record for every month of the loan’s lifetime
  – Updated and appended to each successive month
  – Allows for accurate reconstruction and analysis of historical delinquency cohorts and their fates
Size requirements and processing time considerations may make it infeasible or impractical to store all attributes of all loans at every month

Data Organization

Consider the database size necessary to store monthly records of 100 fields for 100,000 loans for 156 months (a single book year of business over its lifetime): 100 × 100,000 × 156 = 1.56 billion field values, yet only 5 fields change over the life of the loan
Repeat this to add 5 book years of business
A single database will grow rapidly to an unmanageable size (PC environment)
It suffices to have a unique primary key ID field in two databases (dynamic and static) so they can be merged one-to-many correctly
The field typically used for this is the policy certificate number assigned by the mortgage guaranty insurer
Some challenges that can be addressed with a two-database organization:
  – Data purging occurs in a single policy database with key dates
      • If a loan cures but becomes delinquent again at a later date, information on the prior delinquency trip is lost as new information is written for the new delinquency trip
      • A dynamic database with every monthly evaluation date stores the complete performance history
  – Storage requirements are potentially massive for a single database with all characteristics of all loans every month

Data Organization

Time window of claims performance data
  – As much as is available
      • Data availability partly determines constraints of analysis
  – Minimum: enough consecutive months of complete data to observe a credible amount of delinquency resolutions
  – Ideally, enough to:
      • capture an entire economic cycle
      • observe claims development for many delinquency cohorts (each policy can be in force upwards of 15 years)
      • observe claims performance of all currently insured loan types
(business mix)
Time periods need to be contiguous
  – Data “holes” where delinquency activity is absent may make it impossible to determine the fate of any loan actively delinquent leading into the missing evaluation date
  – Delinquency cohorts of missing months cannot be used for analysis

Data Organization

[Chart: Claim Rate (“Clm Rate”) by delinquency cohort, Jan-99 through Jan-07; vertical axis 0.0% to 60.0%]

Delinquency cohort April ’03 is missing from the data
Without adjusting the data, all outstanding delinquencies appear to cure in April ’03 by virtue of not being delinquent in that month (failure in data continuity)
Can adjust using assumptions, but will not know the exact history
Note: Illustrative data for discussion purposes only

Data Organization

[Chart: Claim Rate (“Clm Rate”) by delinquency cohort, Jan-99 through Jan-07; vertical axis 0.0% to 60.0%]

Delinquency cohort April ’03 is missing from the data (Note: Illustrative data for discussion purposes only)
Here, the data “hole” is plugged by repeating March ’03 for April ’03 in tracking fates (in effect, bridging the data “hole”)
Much better, but still not exact; use with caution!
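
The bridging adjustment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the monthly snapshots are assumed to be held in a dictionary keyed by (year, month), and the names `bridge_holes` and `next_month` are hypothetical rather than taken from the presentation. Filled-in months are returned separately so those cohorts can be flagged or excluded.

```python
from typing import Dict, Set, Tuple

Month = Tuple[int, int]        # (year, month), e.g. (2003, 4)
Snapshot = Dict[str, str]      # loan ID -> delinquency status


def next_month(m: Month) -> Month:
    """Return the calendar month following m."""
    year, month = m
    return (year + 1, 1) if month == 12 else (year, month + 1)


def bridge_holes(
    snapshots: Dict[Month, Snapshot],
) -> Tuple[Dict[Month, Snapshot], Set[Month]]:
    """Fill missing evaluation months by repeating the prior month's snapshot.

    Returns the bridged history and the set of months that were filled in.
    """
    if not snapshots:
        return {}, set()
    bridged: Dict[Month, Snapshot] = {}
    filled: Set[Month] = set()
    current, last = min(snapshots), max(snapshots)
    previous: Snapshot = {}
    while current <= last:
        if current in snapshots:
            previous = snapshots[current]
        else:
            # Assumption: statuses are unchanged from the prior month.
            filled.add(current)
        bridged[current] = dict(previous)
        current = next_month(current)
    return bridged, filled


# Hypothetical data: the April 2003 evaluation is missing.
history = {
    (2003, 3): {"A": "60", "B": "30"},
    (2003, 5): {"A": "90"},
}
bridged, filled = bridge_holes(history)
print(sorted(filled))       # months that were carried forward
print(bridged[(2003, 4)])   # copy of the March 2003 snapshot
```

Because the carried-forward month is an assumption rather than observed history, keeping the `filled` set makes it possible to leave those cohorts out of any claim rate indication.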
Decline in claim rate for recent periods is due in part to unresolved delinquencies

Data Organization

Set of loans to warehouse in databases: all loans vs delinquent to date
Delinquent to date requires considerably less storage space than all loans ever written to date
Delinquent to date requires more assumptions, merging, and date logic to program and process
If all loans ever written are included in the databases, there is no ambiguity associated with an omitted loan: it is a data error
  • either because of an accidental omission or because the loan did not belong to the mortgage guaranty company and was removed
By storing only delinquent loans at each evaluation date, a loan may not appear for at least two reasons: the loan is no longer delinquent or there has been a data error
Potential sources of ambiguity include delinquency status or in-force status at any given evaluation date
The design decision of all loans vs delinquent-to-date loans should be based on the analytical requirements of the user and the hardware and software platforms that will support the data

Data Processing

Table 1: Five loans over six delinquency months (six delinquency cohorts)

  A         B                C        D
  Record #  Evaluation Date  Loan ID  Status*
   1        Jan-06           1        0
   2        Jan-06           2        30
   3        Jan-06           3        90
   4        Jan-06           4        60
   5        Jan-06           5        0
   6        Feb-06           1        0
   7        Feb-06           2        30
   8        Feb-06           3        90
   9        Feb-06           4        0
  10        Feb-06           5        30
  11        Mar-06           1        30
  12        Mar-06           2        30
  13        Mar-06           3        120
  14        Mar-06           4        30
  15        Mar-06           5        60
  16        Apr-06           1        0
  17        Apr-06           2        30
  18        Apr-06           3        FCL
  19        Apr-06           4        60
  20        Apr-06           5        30
  21        May-06           1        30
  22        May-06           2        30
  23        May-06           3        FCL
  24        May-06           4        FCL
  25        May-06           5        0
  26        Jun-06           1        0
  27        Jun-06           2        30
  28        Jun-06           3        CLM
  29        Jun-06           4        CLM
  30        Jun-06           5        0

  * 0 = Current; 30, 60, 90, 120 = payment days past due; FCL = foreclosure; CLM = claim
  Columns E (Delq), F (Delq Trip), G (Cure), and H (Claim) are not yet populated at this stage

Goal: determine the fate of each loan for every month it is delinquent while keeping track of delinquency trips, so claim ratios can be calculated for each cohort
DelqTrip:
  • If a loan cures, it no longer needs a reserve
  • If a
loan cures and later becomes delinquent again and claims, the first trip still gets credit for the cure
  • MIs do not reserve for ultimate claims for all currently insured loans

Data Processing

Delinquency fates are determined by looking forward in time from each evaluation month to determine the resolution of each delinquency
Once delinquency fates are determined, the empirical conditional probability of claim for each monthly cohort and each delinquency status can be calculated via aggregation
Tallies are summed by delinquency cohort and risk characteristics
Consider Loan ID 3 from the previous table

  A         B                C        D        E     F          G     H
  Record #  Evaluation Date  Loan ID  Status*  Delq  Delq Trip  Cure  Claim
   3        Jan-06           3        90       1     1          0     1
   8        Feb-06           3        90       1     1          0     1
  13        Mar-06           3        120      1     1          0     1
  18        Apr-06           3        FCL      1     1          0     1
  23        May-06           3        FCL      1     1          0     1
  28        Jun-06           3        CLM      0     1          0     1

  * 0 = Current; 30, 60, 90, 120 = payment days past due; FCL = foreclosure; CLM = claim

Loan ID 3 claims in Jun-06 and backfills as a claim (col H) for all evaluation dates

Data Processing

Alternatively, consider Loan ID 4

  A         B                C        D        E     F          G     H
  Record #  Evaluation Date  Loan ID  Status*  Delq  Delq Trip  Cure  Claim
   4        Jan-06           4        60       1     1          1     0
   9        Feb-06           4        0        0     NA         NA    NA
  14        Mar-06           4        30       1     2          0     1
  19        Apr-06           4        60       1     2          0     1
  24        May-06           4        FCL      1     2          0     1
  29        Jun-06           4        CLM      0     2          0     1

  * 0 = Current; 30, 60, 90, 120 = payment days past due; FCL = foreclosure; CLM = claim

There are two delinquency trips for Loan ID 4: one that began on or before Jan-06 and a second that begins in Mar-06
Loan ID 4 backfills as a cure for Jan-06 (col G) for its first delinquency trip and backfills as a claim (col H) for its second delinquency trip
Cannot just look at final ultimate status to determine tallies
MI companies reserve only for losses related to current delinquencies that will not cure before leading to the insurance loss
Contiguous history is key to determining resolutions and tallies

Data Processing

Table 1 with complete fate tallies

  A         B                C        D        E     F          G     H
  Record #  Evaluation Date  Loan ID  Status*  Delq  Delq Trip  Cure  Claim
   1        Jan-06           1        0        0     NA         NA    NA
   2        Jan-06           2        30       1     1          0     0
   3        Jan-06           3        90       1     1          0     1
   4        Jan-06           4        60       1     1          1     0
   5        Jan-06           5        0        0     NA         NA    NA
   6        Feb-06           1        0        0     NA         NA    NA
   7        Feb-06           2        30       1     1          0     0
   8        Feb-06           3        90       1     1          0     1
   9        Feb-06           4        0        0     NA         NA    NA
  10        Feb-06           5        30       1     1          1     0
  11        Mar-06           1        30       1     1          1     0
  12        Mar-06           2        30       1     1          0     0
  13        Mar-06           3        120      1     1          0     1
  14        Mar-06           4        30       1     2          0     1
  15        Mar-06           5        60       1     1          1     0
  16        Apr-06           1        0        0     NA         NA    NA
  17        Apr-06           2        30       1     1          0     0
  18        Apr-06           3        FCL      1     1          0     1
  19        Apr-06           4        60       1     2          0     1
  20        Apr-06           5        30       1     1          1     0
  21        May-06           1        30       1     2          1     0
  22        May-06           2        30       1     1          0     0
  23        May-06           3        FCL      1     1          0     1
  24        May-06           4        FCL      1     2          0     1
  25        May-06           5        0        0     NA         NA    NA
  26        Jun-06           1        0        0     NA         NA    NA
  27        Jun-06           2        30       1     1          0     0
  28        Jun-06           3        CLM      0     1          0     1
  29        Jun-06           4        CLM      0     2          0     1
  30        Jun-06           5        0        0     NA         NA    NA

  * 0 = Current; 30, 60, 90, 120 = payment days past due; FCL = foreclosure; CLM = claim

Observe that Loan ID 2 is unresolved as of Jun-06, so it contributes neither as a cure nor as a claim
Loan ID 2 remains one payment behind for the entire six-month period

Data Processing

In practice, there are not 30 records for five loans to analyze but potentially millions of records for hundreds of thousands of loans
At the end of 2006, the private mortgage industry had nearly $800 billion of primary insurance in force
Use a programming language that can handle do-loops and consecutive record comparison, so that key ID fields, delinquency statuses, and evaluation dates can be compared and processed
  • e.g. C++, Visual Basic

Delinquency Analysis

Once fates are tallied at the loan level, they can be aggregated and analyzed for each delinquency cohort and delinquency status
Consider delinquency cohort Mar-06 from the previous example
Use only resolved delinquencies to calculate the ever-to-date empirical claim rate
Can make assumptions about the claim rate on unresolved delinquencies for projections
  • ultimate claim rate
  • maximum claim rate

  A                   B        C      D      E       F = D+E         G = E/F
  Delinquency Cohort  Status*  Delqs  Cures  Claims  Resolved Delqs  Claim Rate
  Mar-06              30       3      1      1       2               50%
  Mar-06              60       1      1      0       1               0%
  Mar-06              90       0      0      0       0               NA
  Mar-06              120      1      0      1       1               100%
  Mar-06              FCL      0      0      0       0               NA

  * 30, 60, 90, 120 = payment days past due; FCL = foreclosure

Delinquency Analysis

Once fates are tallied at the loan level, they can be aggregated and analyzed along various risk dimensions after loan characteristics are merged on from the static database
  – can assess risk interactions
Risk dimensions selected depend on data availability and the actuary’s judgment on predictive value and credibility (homogeneity vs data thinning)

  A        B              C      D      E       F = D+E         G = E/F
  Status*  Loan-To-Value  Delqs  Cures  Claims  Resolved Delqs  Claim Rate
  30        90            1000    930    70     1000             7%
  30        95            1200   1092   108     1200             9%
  30       100            1400   1232   168     1400            12%
  60        90             800    720    80      800            10%
  60        95             900    792   108      900            12%
  60       100            1000    860   140     1000            14%
  90        90             600    528    72      600            12%
  90        95             700    595   105      700            15%
  90       100             800    664   136      800            17%
  120       90             300    240    60      300            20%
  120       95             350    266    84      350            24%
  120      100             400    288   112      400            28%
  FCL       90             100     65    35      100            35%
  FCL       95             120     72    48      120            40%
  FCL      100             140     77    63      140            45%

  * 30, 60, 90, 120 = payment days past due; FCL = foreclosure

Incorporating Economic Variables

Economic variables can be incorporated into the data organization and analysis in the same way as underwriting variables and delinquency status
Economic variables found to be predictive include:
  – Home price appreciation
      • Historical data available from the Office of Federal Housing Enterprise Oversight (OFHEO): quarterly home price indices by MSA since the mid-1970s
      • Annual and quarterly forecasts available for purchase from Global Insight and Moody’s Economy.com
  – Market interest rate
      • Historical data available from Freddie Mac: monthly interest rates by geographic region
      • Actuary can calibrate an interest rate model for simulations, or interest rate forecasts are available from various sources including the Mortgage Bankers Association
  – Unemployment rate
      • Historical data available from the Bureau of Labor Statistics
      • Forecasts of unemployment available for purchase

Reserve Factor Analysis

Claim rate frequency indications can be calculated using summary statistics of the actuary’s choice by using
different groupings of delinquency cohorts
From these indications, along with other sources for consideration, the actuary can select frequency factors to be applied to the current, and potentially future, cohort of delinquent loans for loss reserving purposes
Severity factors can be derived by delinquency cohort using the output from the tally processing at the loan level in conjunction with exposure and claim loss dollar information

Conclusions

Mortgage guaranty loss reserves are provisions for losses due to insured loans currently delinquent, both reported and unreported
Specifically, there need not be a provision for losses due to loans insured but not delinquent
As a result, whether a loan is delinquent or not is integral to the reserve estimate
The degree of a loan’s delinquency is a significant predictor of loan default and insured loss
A loan’s delinquency status can change monthly
These issues combined create a need for the reserving actuary to have a contiguous historical performance data warehouse that is usually best organized into two databases: dynamic and static
A two-database organization addresses computer processing and storage considerations and constraints
The ability to reconstruct accurate historical claim rates requires monthly database updating, relational database fields with integrity, and maintenance without purging data

Acknowledgments

This presentation is based on “Data Organization and Analysis in Mortgage Insurance: The Implications of Dynamic Risk Characteristics”
Special thanks to Paul Keuler, whose advice and collaboration were critical in creating and developing the analysis methods presented herein
Special thanks to Dave Hudson and the Committee on Management of Data and Information for the opportunity to submit our paper and for their valuable feedback
Special thanks to the Committee on the Ratemaking Seminar for inviting us to present and offering a venue

References

“Mortgage Guaranty Insurance Model Act” Model #630-1, National Association of Insurance Commissioners, Section 16.
DeFranco, Ralph, “Modeling Residential Mortgage Termination and Severity Using Loan Level Data”, Spring 2002.
“Mortgage Guaranty Insurance Model Act” Model #630-1, National Association of Insurance Commissioners, Section 2.
“Mortgage Guaranty Insurance Model Act” Model #630-1, National Association of Insurance Commissioners, Section 9.
Siegel, Jay, “Moody’s Mortgage Metrics: A Model Analysis of Residential Mortgage Pools”, April 1, 2003.
Inside Mortgage Finance, Feb 16, 2007.
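
Illustrative sketch: fate tallying and cohort claim rates

As an illustration of the Data Processing and Delinquency Analysis steps described earlier, the following Python sketch numbers delinquency trips, looks forward to each trip's fate, and aggregates one cohort by status. It is a minimal sketch, not the presenters' implementation: the presentation suggests languages such as C++ or Visual Basic, and the function names `tally_fates` and `cohort_claim_rates` and the in-memory layout are assumptions made here. One simplification is also assumed: claim-month records (CLM) are not emitted as rows, whereas Table 1 carries them with Delq = 0.

```python
from collections import defaultdict

MONTHS = ["Jan-06", "Feb-06", "Mar-06", "Apr-06", "May-06", "Jun-06"]

# Status history per loan ID, one entry per evaluation month (from Table 1).
HISTORY = {
    1: ["0", "0", "30", "0", "30", "0"],
    2: ["30", "30", "30", "30", "30", "30"],
    3: ["90", "90", "120", "FCL", "FCL", "CLM"],
    4: ["60", "0", "30", "60", "FCL", "CLM"],
    5: ["0", "30", "60", "30", "0", "0"],
}


def tally_fates(statuses, months):
    """Return one row per delinquent month with its trip number and fate.

    Fate is "cure", "claim", or None (unresolved at the end of the data).
    """
    rows = []
    trip = 0
    in_trip = False
    for i, status in enumerate(statuses):
        if status in ("0", "CLM"):
            in_trip = False          # a cure or a claim ends the current trip
            continue
        if not in_trip:
            trip += 1                # first delinquent month of a new trip
            in_trip = True
        fate = None
        for later in statuses[i + 1:]:   # look forward for the resolution
            if later == "0":
                fate = "cure"
                break
            if later == "CLM":
                fate = "claim"
                break
        rows.append({"month": months[i], "status": status,
                     "trip": trip, "fate": fate})
    return rows


def cohort_claim_rates(history, months, cohort):
    """Tally one delinquency cohort by status; rate uses resolved loans only."""
    counts = defaultdict(lambda: {"delqs": 0, "cures": 0, "claims": 0})
    for statuses in history.values():
        for row in tally_fates(statuses, months):
            if row["month"] != cohort:
                continue
            tally = counts[row["status"]]
            tally["delqs"] += 1
            if row["fate"] == "cure":
                tally["cures"] += 1
            elif row["fate"] == "claim":
                tally["claims"] += 1
    rates = {}
    for status, tally in counts.items():
        resolved = tally["cures"] + tally["claims"]
        rates[status] = tally["claims"] / resolved if resolved else None
    return dict(counts), rates


counts, rates = cohort_claim_rates(HISTORY, MONTHS, "Mar-06")
for status in ("30", "60", "120"):
    print(status, counts[status], rates[status])
```

For the Mar-06 cohort this reproduces the tallies in the cohort table: three loans at status 30 with one cure and one claim (50% on resolved delinquencies), one cure at status 60 (0%), and one claim at status 120 (100%). Statuses with no delinquent loans in the cohort simply do not appear in the result.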