Analytics in Financial Services

10. Customer Spend Analysis: Unlocking the True Value of a Transaction

Vinay Prasad
Principal Architect, Banking and Capital Markets Practice, Infosys Technologies Limited

Financial institutions have compiled a wealth of customer transaction data over the years. When properly analyzed, such data can unlock a treasure trove of predictive information: when the customer will spend, where such spending will occur, and how much will be spent.
This article analyzes spend events, the techniques to identify spend events, and the process of utilizing spend patterns to predict customer spending behavior.

Introduction

Transactional information stored in a financial institution is embedded with information that forms the basis of a spend analysis. Moving into the second decade of the 21st century, a key imperative for banks is to extract this information and convert it into actionable insights. Imagine the capability to predict:

· When the customer will take his/her next vacation
· When the customer will eat out, where he/she will go, and how much he/she will spend
· When the customer will spend at the mall and at what stores

It is no longer enough to know how much the customer is likely to spend. Marketing managers globally now want to know the “when”, “where” and “what”, details which make the analysis much more useful.

To conduct such an elaborate analysis, a firm must know:

1. What was purchased
One crucial piece of information missing in transactional data is the details of the specific goods or services purchased. As a proxy, the merchant type can be used to determine the kind of goods or services the client has purchased.

2. When was it purchased
The transaction date and time are important here, as they help in identifying the sequence of events. They also help in identifying the frequency of spend on a particular type of good or service.

3. Who made the purchase
Was the transaction conducted by the main account holder or by one of the dependents? This can help gather some demographic information on the actual purchaser.

4. Where was the purchase made
The geocode of the merchant helps in identifying the location of the purchase (unless the purchase was an e-commerce transaction).

5. How much was spent
This information is stated on the transaction.

Using this information, one can take a number of approaches to define a predictive model for customer-spend analytics. The choice of the appropriate model depends on conditions specific to the case. Two such models are highlighted below.

Model 1: Pre-defined Customer Segmentation
Customers are grouped based on certain demographic data, on the assumption that people of similar demographic backgrounds behave in a consistent way. The credit card transactions of customers in a particular group are analyzed to derive a pattern. Here, the bank looks for similar transactions made within a predefined period by a group of customers. Once a pattern is recognized, any new customer who falls within the group is expected to behave in the same fashion.

Model 2: Customer Behavior-based Segmentation
Transactions are analyzed to bring out similar behavior that has occurred at least a certain number of times across the customer population. Once such behavior is identified, the subset of customers exhibiting it is analyzed against the rest of the customer base to bring out the discriminating factors. Any new customer exhibiting these discriminating factors is then expected to behave as per the identified behavior pattern. This approach is to be used when customer behavior patterns cut across demographic groups, so that one cannot group customers purely on demographics with adequate confidence.

Spend analytics in this article will focus on analysis using Model 2, working from spend data available mainly in the form of credit card transactions.

What is a Spend Event?

A spend event is a set of transactions a customer makes to fulfill a need. Customers often make a similar sequence of transactions if they have the same need. For example, consider a customer who has the need to go on a vacation:

a) The customer may book his/her itinerary well in advance (say x days, as this can vary).
b) On the day the vacation begins, the customer spends on a taxi (in this case it is paid by card).
c) The customer checks in at the airport, possibly using the credit card.
d) The customer rents a vehicle, requiring a credit card swipe.
e) The customer checks in at the hotel and swipes the card.
f) The customer dines out more frequently during the vacation, swiping the card at various restaurants.

All of these transactions occur in a cluster on the time axis, and the time span across the transactions in the cluster would be fairly consistent across the client base.

To start the analysis, the first question to answer is the time period across which the data should be analyzed: it could be one statement month, multiple statement months, or any other time window based on the hypothesis being tested.

The next question concerns the granularity of the data. The level of granularity will depend on the target audience of the analysis. For example, a bank may not be interested in the details of individual restaurant expenses, in which case all restaurant-related expenses during a day may be aggregated into one expense. Overall, aggregation reduces the number of records to be analyzed by removing unwanted details.

Each transaction can be classified by the merchant's industry. Hence, the transactions can be associated with the type of goods/services purchased at a broader level. The customer spend across months (statements), classified by the type of goods and services purchased, leads to the critical “when”, “where”, and “what” information discussed in previous sections:

1. What is the regular set of products and services a customer spends on
2. Where does the customer usually spend on these goods/services
3. How much does the customer spend on any of these goods/services
4. Who makes the purchase of specific goods/services: the primary card-holder or the dependent
5. When does the customer make such purchases: not just the time of day, but the timing of one type of spend event in relation to other types of spend events

Identifying Spend Events

Identifying spend events involves the clustering of transactions, using a single-dimensional distance measure (based on time gap) or a multi-dimensional distance measure (based on time gap and other attributes such as amount spent). Once the transactions are clustered, each cluster may be converted into a spend event by capturing the different types of goods/services bought (based on merchant type), the place and time of each transaction (if transactions are not aggregated across merchants in a merchant type), the time span across all transactions, the transaction amount, and merchant details. For the purpose of Model 2 (customer behavior-based segmentation), the spend event is a building block for the construction of a spend pattern across a larger time frame.

Spend events can be classified as irregular or regular, based on their recurrence across the time windows being considered in an analysis. Goods/services may also be used as a proxy for identifying regular and irregular spend events, though one has to be careful: eating out at a restaurant may be a regular spend for one customer and an irregular one for another.

The identification of the spend event is based on a number of criteria that the bank has to determine based on the nature of the analysis.
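As a rough illustration of the single-dimensional (time gap) clustering described above, the following Python sketch groups transactions into clusters whenever consecutive postings are no more than a chosen number of days apart, and then summarizes each cluster as a spend event. The sample records (amounts loosely based on Figure 1), the merchant-type labels, the function names `cluster_by_time_gap` and `to_spend_event`, and the two-day threshold are all assumptions made for illustration, not part of any real system.

```python
from datetime import date

# Hypothetical sample: (posted date, merchant type, amount spent).
# The merchant-type labels are illustrative, not real merchant category codes.
transactions = [
    (date(2009, 6, 11), "travel_booking", 107.07),
    (date(2009, 6, 13), "attraction", 11.09),
    (date(2009, 6, 13), "lodging", 135.55),
    (date(2009, 6, 14), "fuel", 15.91),
    (date(2009, 6, 18), "restaurant", 6.37),
]


def cluster_by_time_gap(txns, max_gap_days=2):
    """Group date-sorted transactions; a new cluster starts when the gap to
    the previous transaction exceeds max_gap_days (an assumed threshold)."""
    clusters = []
    for txn in sorted(txns, key=lambda t: t[0]):
        if clusters and (txn[0] - clusters[-1][-1][0]).days <= max_gap_days:
            clusters[-1].append(txn)
        else:
            clusters.append([txn])
    return clusters


def to_spend_event(cluster):
    """Summarize one cluster using attributes named in the text:
    merchant types, time span, amount, and number of transactions."""
    return {
        "start": cluster[0][0],
        "span_days": (cluster[-1][0] - cluster[0][0]).days,
        "merchant_types": sorted({t[1] for t in cluster}),
        "total_amount": round(sum(t[2] for t in cluster), 2),
        "n_transactions": len(cluster),
    }


events = [to_spend_event(c) for c in cluster_by_time_gap(transactions)]
for event in events:
    print(event)
```

With a two-day gap, the travel booking and the trip postings fall into one event, while the later restaurant posting forms its own event. A multi-dimensional measure would replace the date comparison with a distance computed over several attributes, such as time gap and amount.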
For example, if the target is to identify irregular spend events from the data in Figure 1, the following factors can be used:

a) Number of transactions per day
b) Location of transactions
c) Day of the week

Based on these criteria, the bank is able to identify that:

a) The regular spending location is the NY/NJ metropolitan area
b) The number of transactions on a regular work day ranges from 2 to 3

Hence, an irregular spend event (marked in blue in the original figure) is characterized by:

a) Number of transactions increased to 8 on 6/13
b) Location around Niagara Falls, NY
c) Preceded by a booking on a travel site (orbitz.com), made 2 days in advance

Note: The classification of a spend event as irregular or regular depends on the time span across which the data is analyzed. The same spend event may be classified as “regular” if the time span covers multiple years in which such weekend trips recur every summer.

Figure 1: An excerpt from a card statement highlighting an irregular spend event

Posted Date | Payee | Address | Amount | Day of week
6/11/2009 | ORB MADAQT ORBITZ.COM IL | ORBITZ.COM IL | -107.07 | Thursday
6/13/2009 | SEARS ROEBUCK 1684 WOODBRIDGE NJ | WOODBRIDGE NJ | -24.99 | Saturday
6/13/2009 | MAID OF THE MIST STORE NIAGARA FALLSNY | NIAGARA FALL NY | -11.09 | Saturday
6/13/2009 | NIAGARA PK CAVE OF WIN NIAGARA FALLSNY | NIAGARA FALL NY | -14.58 | Saturday
6/13/2009 | NIAGARA PK CAVE OF WIN NIAGARA FALLSNY | NIAGARA FALL NY | -92 | Saturday
6/13/2009 | PETRO #371 WATERLOO WATERLOO NY | WATERLOO NY | -27.81 | Saturday
6/13/2009 | KOHINOOR INDIAN RESTUR 716-284-2414 NY | 716-284-2414 NY | -35 | Saturday
6/13/2009 | COMFORT INN OF BINGHAM BINGHAMTON NY | BINGHAMTON NY | -135.55 | Saturday
6/13/2009 | HKK SUPER SERVICE FLANDERS NJ | FLANDERS NJ | -21.13 | Saturday
6/14/2009 | EXXONMOBIL 97360424 WEST HENRIETTNY | WEST HENRIET NY | -15.91 | Sunday
6/14/2009 | HOLIDAY INN GRAND HOTEL GRAND ISLAND NY | GRAND ISLAND NY | -136.98 | Sunday
6/14/2009 | PILOT 00001701 BINGHAMTON NY | BINGHAMTON NY | -13.79 | Sunday
6/14/2009 | DNC SCOTTSVILLE TRVL F W. HENRIETTA NY | W. HENRIETTA NY | -12.01 | Sunday
6/15/2009 | ZAHRAS CAFE AND BAKE JERSEY CITY NJ | JERSEY CITY NJ | -7.44 | Monday
6/15/2009 | PATHTVM NEWARK BM BW 212-METROCARDNY | 212-METROCAR NY | -54 | Monday
6/15/2009 | BLIMPIES JERSEY CITY NJ | JERSEY CITY NJ | -6.51 | Monday
6/16/2009 | RELIANCE COMMUNICATION 888-673-5426 NY | 888-673-5426 NY | -33.58 | Tuesday
6/17/2009 | TOYS R US #6318 ISELIN NJ | ISELIN NJ | -27.76 | Wednesday
6/17/2009 | WEGMANS #032 WOODBRIDGE NJ | WOODBRIDGE NJ | -32.83 | Wednesday
6/18/2009 | TASTE OF INDIA JERSEY CITY NJ | JERSEY CITY NJ | -2.14 | Thursday
6/18/2009 | TASTE OF INDIA JERSEY CITY NJ | JERSEY CITY NJ | -6.37 | Thursday
6/20/2009 | TASTE OF INDIA JERSEY CITY NJ | JERSEY CITY NJ | -7.44 | Thursday

Figure 2: Excerpt from a card statement highlighting a shift in regular spending habits against the data in Figure 1

Posted Date | Payee | Address | Amount
9/8/2009 | SUBZI MANDI ISELIN NJ | ISELIN NJ | -50.52
9/11/2009 | NJT LIBERTY ST.DLY TV7 JERSEY CITY NJ | JERSEY CITY NJ | -6.8
9/11/2009 | WEGMANS #032 WOODBRIDGE NJ | WOODBRIDGE NJ | -33.34
9/11/2009 | HESS 30215 WOODBRIDGE NJ | WOODBRIDGE NJ | -25.86
9/11/2009 | NFI*WWW.NETFLIX.COM/CC NETFLIX.COM CA | NETFLIX.COM CA | -18.18
9/12/2009 | TASTE OF INDIA JERSEY CITY NJ | JERSEY CITY NJ | -7.44
9/12/2009 | NJT LIBERTY ST.DLY TV7 JERSEY CITY NJ | JERSEY CITY NJ | -3
9/12/2009 | WEGMANS #032 WOODBRIDGE NJ | WOODBRIDGE NJ | -18.87
9/12/2009 | WEGMANS #032 WOODBRIDGE NJ | WOODBRIDGE NJ | -51.38
9/12/2009 | NEW JERSEY E-ZPASS 888-288-6865 NJ | 888-288-6865 NJ | -25
9/14/2009 | TASTE OF INDIA JERSEY CITY NJ | JERSEY CITY NJ | -8.69
9/14/2009 | NJT LIBERTY ST.DLY TV7 JERSEY CITY NJ | JERSEY CITY NJ | -3
9/14/2009 | USPS 33382504929213949 ISELIN NJ | ISELIN NJ | -12.95
9/14/2009 | BHAVANI CASH & CARRY ISELIN NJ | ISELIN NJ | -28.01
9/14/2009 | WEGMANS #032 WOODBRIDGE NJ | WOODBRIDGE NJ | -14.98
9/15/2009 | TOYS R US #6318 ISELIN NJ | ISELIN NJ | -80
9/15/2009 | TOYS R US #6318 ISELIN NJ | ISELIN NJ | -21.39
9/16/2009 | NJT LIBERTY ST.DLY TV7 JERSEY CITY NJ | JERSEY CITY NJ | -16.25
9/16/2009 | NJT LIBERTY ST.DLY TV7 JERSEY CITY NJ | JERSEY CITY NJ | -3

Similarly, if the objective is to identify a shift in regular spend events (marked in gray in the original Figure 1 and Figure 2), the bank would look at the following criteria:

a) Merchant segment: in this case, transportation
b) Location: unchanged
c) Average spend: look for a significant change
d) Number of transactions: look for a significant change

Translating Spend Events into Spend Patterns

A spend pattern is defined as a sequence of spend events observed across multiple customers, outlining the following:

a) Sequence of occurrence of such events
b) Time period between two events

Spend events across the transaction history of a customer are taken to form a spend sequence, which is associated with the age or other demographic data obtained from the customer's records. Spend sequences across customers go through a discriminant analysis to identify the factors that distinguish customer segments with similar spend patterns. The customer segments will need to be updated on a regular basis to get a better picture of customer spending patterns.

Tapping into the Predictive Powers of a Spend Pattern

A customer may be missing a couple of spend events here and there, but generally, all clients belonging to a spend pattern should have the same general sequence of events, and the time gap between the events should be more or less the same. To identify a spend pattern, it is recommended that the bank define an error limit for the time gap, so that two sequences can be considered similar.
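A minimal Python sketch of such an error-limit check is shown here. It assumes that the two spend sequences already contain the same events in the same order, so that their time gaps can be compared position by position; sequences with missing events would first need to be aligned. The function name `gaps_match`, the gap values, and the seven-day limit are hypothetical.

```python
def gaps_match(gaps_a, gaps_b, error_limit_days):
    """Return True when both sequences have the same number of gaps and every
    pair of corresponding gaps differs by at most error_limit_days."""
    if len(gaps_a) != len(gaps_b):
        return False
    return all(abs(a - b) <= error_limit_days for a, b in zip(gaps_a, gaps_b))


# Hypothetical gaps, in days, between consecutive spend events.
customer_1 = [30, 90, 14]
customer_2 = [33, 85, 16]
customer_3 = [30, 200, 14]

print(gaps_match(customer_1, customer_2, error_limit_days=7))  # True
print(gaps_match(customer_1, customer_3, error_limit_days=7))  # False
```

The error limit is a business choice: a tighter limit yields fewer, more homogeneous patterns, while a looser one groups more customers together.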
If the difference between related time gaps across two sequences is within the error limit, then they are considered to be part of the same spend pattern.

If each spend event is denoted by a “letter”, a spend sequence can be thought of as a “word”. To identify whether two such sequences are part of a pattern, the bank would have to use a sequence alignment algorithm, such as the Needleman-Wunsch technique. Here, the user will have to define the weights to be associated with the match and mismatch of residues (in this case, individual spend events) and also with gaps in the sequence. This leads to a score for the alignment between the two sequences. The user can also define a limit on the score between two sequences for the two to be part of one pattern.

A set of life-cycle events and their notation is given in Figure 3, and Figure 4 shows the sequence of spend events across multiple customers over a period of time. There are two patterns, FIC and HG, in the data highlighted in Figure 4. FIC, as a pattern, indicates an increase in disposable income; hence, Customers 1 and 3 may be more attractive to financial services and lifestyle firms. HG, as a pattern, indicates readiness for healthcare products.

Figure 3: Notations used in a spend pattern for individual spend events

Spend Event | Denoted By
International Vacation | I
Domestic Vacation | D
Drop in Payments to Financial Institutions | F
Increased Transaction on a Dependent Card | C
Medical Expenses (Hospital Payments) | H
Expenses Related to Gym | G

Figure 4: Example of spend patterns

Customer | Spend Pattern
Customer 1 | HFICG
Customer 2 | HG
Customer 3 | FICD

Privacy

Needless to say, the above analysis can be seen as an invasion of customer privacy. To avoid any breach of customer privacy, the bank should take care of the following:

a) Ensure that the transactional and demographic data cannot be linked back to customer identity information.
b) Rely on the fact that a pattern, by construction, requires a given sequence of events to repeat across customers. The discriminant analysis then provides the demographic information that leads to the categorization of customers. Customers in a category are treated similarly, which shields individual spending habits.
c) Take care in disposing of the intermediate data created during analysis, as it contains customer-specific patterns (though if the data is devoid of customer identity information, linking the two becomes extremely hard).

Conclusion

Transactional data stored within financial services firms provides a wealth of information that can be used to better integrate the customer into the financial services firm and the business ecosystem. The information extracted can provide goods/service providers with powerful insights into customer behavior, driving improved targeted marketing efforts. Care should be taken while doing such analysis to safeguard the customer identity data, never allowing it to be linked to the analysis process. The information gathered from this analysis process should be linked to demographic segmentation for further marketing actions.
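As a closing illustration of the sequence-alignment step discussed under spend patterns, the following Python sketch computes a Needleman-Wunsch global alignment score for the spend sequences of Figure 4. The weights (match +1, mismatch -1, gap -1) and the function name `needleman_wunsch_score` are assumptions chosen for illustration; in practice the bank would set its own weights and its own score limit.

```python
from itertools import combinations


def needleman_wunsch_score(seq_a, seq_b, match=1, mismatch=-1, gap=-1):
    """Global alignment score computed with the standard dynamic-programming
    table; the default weights are illustrative assumptions."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against an empty sequence costs one gap per event.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            same = seq_a[i - 1] == seq_b[j - 1]
            diagonal = score[i - 1][j - 1] + (match if same else mismatch)
            score[i][j] = max(diagonal,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    return score[-1][-1]


# Spend sequences from Figure 4, one letter per spend event (see Figure 3).
sequences = {"Customer 1": "HFICG", "Customer 2": "HG", "Customer 3": "FICD"}

for (name_a, seq_a), (name_b, seq_b) in combinations(sequences.items(), 2):
    print(name_a, name_b, needleman_wunsch_score(seq_a, seq_b))
# Customer 1 Customer 2 -1
# Customer 1 Customer 3 1
# Customer 2 Customer 3 -4
```

Under these assumed weights, Customers 1 and 3, who share the FIC events, receive the highest pairwise score. A global alignment penalizes every unmatched event, so short shared patterns inside long sequences may call for different gap weights or a local alignment variant.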