+
Adaptive Fraud Detection
by Tom Fawcett and Foster Provost
Presented by: Lara Nargozian
Updated from the last 3 years' presentations
by Adam Boyer, Yunfei Zhao, and Ahmed Abdeen Hamed
+
2
Why?

 Solve real-world problems that are very important to each and every one of us

 Provide a framework that can be adapted to solve similar problems

 Use Data Mining algorithms and techniques learned this semester
  Rule Learning
  Classification

 Fun to learn about
+
3
Outline

 Problem Description
  Cellular cloning fraud problem
  Why it is important
  Current strategies

 Construction of Fraud Detector
  Framework
  Rule learning, Monitor construction, Evidence combination

 Experiments and Evaluation
  Data used in this study
  Data preprocessing
  Comparative results

 Conclusion

 Exam Questions
+
4
The Problem

 How to detect suspicious changes in user behavior to identify and prevent cellular fraud
  Non-legitimate users, aka bandits, gain illicit access to a legitimate user's (the victim's) account

 The solution is useful in other contexts
  Identifying and preventing credit card fraud, toll fraud, and computer intrusion
+
5
Cellular Fraud - Cloning

 Cloning Fraud
  A kind of superimposition ("parasite") fraud
  Fraudulent usage is superimposed upon (added to) the legitimate usage of an account
  Causes inconvenience to customers and great expense to cellular service providers
+
6
Cellular Communications and Cloning Fraud

 Mobile Identification Number (MIN) and Electronic Serial Number (ESN)
  Identify a specific account
  Periodically transmitted unencrypted whenever the phone is on

 Cloning occurs when a customer's MIN and ESN are programmed into a cellular phone not belonging to the customer
  The bandit can make virtually unlimited, untraceable calls at someone else's expense
+
7
Interest in Reducing Cloning Fraud

 Fraud is detrimental in several ways:
  Fraudulent usage congests cell sites
  Fraud incurs land-line usage charges
  Cellular carriers must pay costs to other carriers for usage outside the home territory
  The crediting process is costly to the carrier and inconvenient to the customer
+
8
Strategies for Dealing with Cloning Fraud

 Pre-call Methods
  Identify and block fraudulent calls as they are made
  Validate the phone or its user when a call is placed

 Post-call Methods
  Identify fraud that has already occurred on an account so that further fraudulent usage can be blocked
  Periodically analyze call data on each account to determine whether fraud has occurred
+
9
Pre-call Methods

 Personal Identification Number (PIN)
  PIN cracking is possible with more sophisticated equipment

 RF Fingerprinting
  Method of identifying phones by their unique transmission characteristics

 Authentication
  Reliable and secure private key encryption method
  Requires special hardware capability
  An estimated 30 million non-authenticatable phones were in use in the US alone (in 1997)
+
10
Post-call Methods

 Collision Detection
  Analyze call data for temporally overlapping calls

 Velocity Checking
  Analyze the locations and times of consecutive calls

 Disadvantage of the above methods
  Usefulness depends upon a moderate level of legitimate activity
+
11
Another Post-call Method
(Main focus of this paper)

 User Profiling
  Analyze calling behavior to detect usage anomalies suggestive of fraud
  Works well with low-usage customers
  Good complement to collision and velocity checking because it covers cases the others might miss
+
12
Sample Frauded Account

Date     Time      Day  Duration    Origin         Destination    Fraud
1/01/95  10:05:01  Mon  13 minutes  Brooklyn, NY   Stamford, CT
1/05/95  14:53:27  Fri  5 minutes   Brooklyn, NY   Greenwich, CT
1/08/95  09:42:01  Mon  3 minutes   Bronx, NY      Manhattan, NY
1/08/95  15:01:24  Mon  9 minutes   Brooklyn, NY   Brooklyn, NY
1/09/95  15:06:09  Tue  5 minutes   Manhattan, NY  Stamford, CT
1/09/95  16:28:50  Tue  53 seconds  Brooklyn, NY   Brooklyn, NY
1/10/95  01:45:36  Wed  35 seconds  Boston, MA     Chelsea, MA    Bandit
1/10/95  01:46:29  Wed  34 seconds  Boston, MA     Yonkers, NY    Bandit
1/10/95  01:50:54  Wed  39 seconds  Boston, MA     Chelsea, MA    Bandit
1/10/95  11:23:28  Wed  24 seconds  Brooklyn, NY   Congers, NY
1/11/95  22:00:28  Thu  37 seconds  Boston, MA     Boston, MA     Bandit
1/11/95  22:04:01  Thu  37 seconds  Boston, MA     Boston, MA     Bandit
+
13
The Need to be Adaptive

 Patterns of fraud are dynamic: bandits constantly change their strategies in response to new detection techniques

 Levels of fraud can change dramatically from month to month

 The costs of missing fraud and of dealing with false alarms change with inter-carrier contracts
+
14
Automatic Construction of Profiling Fraud Detectors
+
15
One Approach

 Build a fraud detection system by classifying calls as fraudulent or legitimate

 However, there are two problems that make simple classification techniques infeasible
+
16
Problems with Simple Classification

 Context
  A call that would be unusual for one customer may be typical for another customer (for example, a call placed from Brooklyn is not unusual for a subscriber who lives there, but might be very strange for a Boston subscriber)

 Granularity (overfitting?)
  At the level of the individual call, the variation in calling behavior is large, even for a particular user
+
17
In Summary:
Learning the Problem

1. Which phone call features are important?
2. How should profiles be created?
3. When should alarms be raised?
+
18
Proposed Detector Constructor Framework (DC-1)
+
19
DC-1 Processing Account-Day Example
+
20
DC-1 Fraud Detection Stages

 Stage 1: Rule Learning
 Stage 2: Profile Monitoring
 Stage 3: Combining Evidence
+
21
Rule Learning - the 1st Stage

 Rule Generation
  Rules are generated locally, based on differences between fraudulent and normal behavior for each account

 Rule Selection
  The local rules are then combined in a rule selection step
+
22
Rule Generation

 DC-1 uses the RL program to generate rules with certainty factors above a user-defined threshold

 For each account, RL generates a "local" set of rules describing the fraud on that account

 Example:
  (Time-of-Day = Night) AND (Location = Bronx) ==> FRAUD
  Certainty Factor = 0.89
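
To make the rule representation concrete, here is a minimal Python sketch of how a certainty factor could be estimated for the example rule. The Call record, its field names, and the precision-style estimate are illustrative assumptions, not RL's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Call:
    time_of_day: str   # e.g. "NIGHT" (hypothetical field names)
    location: str      # e.g. "Bronx"
    fraud: bool        # class label from the labeled call data

def rule_matches(call: Call) -> bool:
    """The slide's example rule: (Time-of-Day = Night) AND (Location = Bronx)."""
    return call.time_of_day == "NIGHT" and call.location == "Bronx"

def certainty_factor(calls: list[Call]) -> float:
    """Fraction of rule-matching calls on this account that are fraudulent.

    A simple precision-style estimate; RL's actual certainty factor
    computation may differ.
    """
    matched = [c for c in calls if rule_matches(c)]
    return sum(c.fraud for c in matched) / len(matched) if matched else 0.0

# DC-1 keeps a rule for an account only if its certainty factor exceeds
# the user-defined threshold (0.89 in the slide's example).
```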
+
23
Rule Selection

 The rule generation step typically yields tens of thousands of rules

 If a rule is found in (or covers) many accounts, then it is probably worth using

 The selection algorithm identifies a small set of general rules that cover the accounts (see the sketch below)

 The resulting set of rules is used to construct specific monitors
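
The paper's covering algorithm appears on the next slide; as a rough approximation, a greedy set-cover pass looks like the sketch below, where the rule-to-accounts mapping is an assumed data structure.

```python
def select_rules(coverage: dict[str, set[str]], accounts: set[str]) -> list[str]:
    """Greedy covering: repeatedly take the rule that covers the most
    still-uncovered accounts, until every account is covered (or no
    rule helps). `coverage` maps a rule id to the set of accounts
    whose local rule set generated it.
    """
    selected: list[str] = []
    uncovered = set(accounts)
    while uncovered and coverage:
        best = max(coverage, key=lambda r: len(coverage[r] & uncovered))
        gained = coverage[best] & uncovered
        if not gained:
            break  # remaining accounts are covered by no candidate rule
        selected.append(best)
        uncovered -= gained
    return selected
```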
+
24
Rule Selection and Covering Algorithm
+
26
Profiling Monitors - the 2nd Stage

Monitors have 2 distinct steps:

 Profiling step:
  The monitor is applied to an account's normal usage to measure the account's normal activity
  Statistics are saved with the account

 Use step:
  The monitor processes a single account-day
  References the normalcy measure from profiling
  Generates a numeric value describing how abnormal the current account-day is
+
27
Most Common Monitor Templates

 Threshold
 Standard Deviation
+
28
Threshold Monitors
+
29
Standard Deviation Monitors
+
30
Comparing the same standard
deviation monitor on two accounts
+
31
Example for Standard Deviation

 Rule:
  (TIMEOFDAY = NIGHT) AND (LOCATION = BRONX) ==> FRAUD

 Profiling step:
  The subscriber called from the Bronx an average of 5 minutes per night, with a standard deviation of 2 minutes. At the end of the profiling step, the monitor would store the values (5, 2) with that account.

 Use step:
  If the monitor processed a day containing 3 minutes of airtime from the Bronx at night, the monitor would emit a zero; if the monitor saw 15 minutes, it would emit (15 - 5)/2 = 5. This value denotes that the account is five standard deviations above its average (profiled) usage level.
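
A minimal Python sketch of this monitor, reproducing the numbers above; the class layout is an assumption, and clamping below-average days to zero follows the example's behavior.

```python
import statistics

class StdDevMonitor:
    """Standard deviation monitor for one rule, e.g. night-time
    minutes called from the Bronx."""

    def profile(self, daily_minutes: list[float]) -> None:
        # Profiling step: summarize the account's normal usage.
        self.mean = statistics.mean(daily_minutes)
        self.std = statistics.stdev(daily_minutes)

    def use(self, minutes_today: float) -> float:
        # Use step: emit how many standard deviations above normal
        # this account-day is; at-or-below-average days emit zero.
        return max(0.0, (minutes_today - self.mean) / self.std)

monitor = StdDevMonitor()
monitor.mean, monitor.std = 5.0, 2.0  # the profiled values (5, 2)
print(monitor.use(3.0))   # 0.0 -> nothing abnormal
print(monitor.use(15.0))  # 5.0 -> five standard deviations above average
```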
+
32
Combining Evidence from the Monitors - the 3rd Stage

 Weights the monitor outputs and learns a threshold on the sum to produce high-confidence alarms

 DC-1 uses a Linear Threshold Unit (LTU)
  Simple and fast
  Enables a good first-order judgment

 A feature selection process is used to choose a small set of useful monitors for the final detector
  Some rules do not perform well when used in monitors, and some overlap
  A forward selection process chooses the set of useful monitors
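
The LTU itself is just a weighted sum compared against a learned threshold; a minimal sketch follows, with made-up weights and threshold standing in for the values DC-1 learns from training account-days.

```python
def ltu_alarm(monitor_outputs: list[float],
              weights: list[float],
              threshold: float) -> bool:
    """Raise an alarm iff the weighted evidence exceeds the threshold."""
    score = sum(w * x for w, x in zip(weights, monitor_outputs))
    return score > threshold

# Hypothetical outputs from three selected monitors for one account-day:
outputs = [5.0, 0.0, 1.2]
weights = [0.6, 0.3, 0.4]   # placeholders for the learned weights
print(ltu_alarm(outputs, weights, threshold=2.5))  # True -> alarm
```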
+
33
Final Output of DC-1

 A detector that profiles each user's behavior based on several indicators

 An alarm is raised when there is sufficient evidence of fraudulent activity
+
34
Data Used in the Study
+
35
Data Information

 Four months of phone call records from the New York City area

 Each call is described by 31 original attributes

 Some derived attributes are added
  Time-of-Day (MORNING, AFTERNOON, TWILIGHT, EVENING, NIGHT)
  To-Payphone

 Each call is given a class label of fraudulent or legitimate
+
36
Data Cleaning

 Eliminated credited calls made to destinations/numbers that are not in the credited block
  The destination number must be called only by the legitimate user

 Days with 1-4 minutes of fraudulent usage were discarded
  These may have been credited for other reasons, such as a wrong number

 Call times were normalized to Greenwich Mean Time for chronological sorting
+
37
Data Description

 Once the monitors are created and accounts profiled, the system transforms raw call data into a series of account-days, using the monitor outputs as features

 Selected for profiling, training, and testing:
  3,600 accounts that have at least 30 fraud-free days of usage before any fraudulent usage
  The initial 30 days of each account were used for profiling
  The remaining days were used to generate 96,000 account-days
  Distinct training and testing accounts: 10,000 account-days for training; 5,000 for testing
  20% fraud days and 80% non-fraud days
+
38
Experiments and Evaluation
+
39
Output of DC-1 Components

 Rule learning: 3,630 rules
  Each covering at least two accounts

 Rule selection: 99 rules

 2 monitor templates yielding 198 monitors

 Final feature selection: 11 monitors
+
40
The Importance of Error Cost

 Classification accuracy is not sufficient to evaluate performance

 Misclassification costs should be taken into account

 Estimated error costs:
  False positive (false alarm): $5
  False negative (letting a fraudulent account-day go undetected): $0.40 per minute of fraudulent airtime

 Factoring in error costs requires a second training pass by the LTU
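
These estimated costs translate directly into the evaluation measure; a minimal sketch, assuming false alarms and undetected fraudulent minutes have already been tallied for a test run.

```python
def total_cost(false_alarms: int,
               missed_fraud_minutes: float,
               fp_cost: float = 5.00,
               fn_cost_per_minute: float = 0.40) -> float:
    """$5 per false alarm plus $0.40 per undetected fraudulent minute."""
    return false_alarms * fp_cost + missed_fraud_minutes * fn_cost_per_minute

print(total_cost(false_alarms=100, missed_fraud_minutes=2000))  # 1300.0
```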
+
41
Alternative Detection Methods

 Collisions + Velocities
  Errors almost entirely due to false negatives

 High Usage
  Detects a sudden large jump in account usage

 Best Individual DC-1 Monitor
  (Time-of-Day = Evening) ==> FRAUD

 SOTA - State Of The Art
  Incorporates 13 hand-crafted profiling methods
  Best detectors identified in a previous study
+
42
DC-1 vs. Alternatives

Detector                  Accuracy (%)  Cost ($)       Accuracy at Cost
Alarm on all              20            20000          20
Alarm on none             80            18111 +/- 961  80
Collisions + Velocities   82 +/- 0.3    17578 +/- 749  82 +/- 0.4
High Usage                88 +/- 0.7    6938 +/- 470   85 +/- 1.7
Best DC-1 monitor         89 +/- 0.5    7940 +/- 313   85 +/- 0.8
State of the art (SOTA)   90 +/- 0.4    6557 +/- 541   88 +/- 0.9
DC-1 detector             92 +/- 0.5    5403 +/- 507   91 +/- 0.8
SOTA plus DC-1            92 +/- 0.4    5078 +/- 319   91 +/- 0.8
+
43
Shifting Fraud Distributions

 A fraud detection system should adapt to shifting fraud distributions

 To illustrate this point:
  One non-adaptive DC-1 detector was trained on a fixed distribution (80% non-fraud) and tested against a range of 75-99% non-fraud
  Another DC-1 detector was allowed to adapt (re-train its LTU threshold) for each fraud distribution
  The second detector was more cost effective than the first
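
What "re-training the LTU threshold" amounts to can be sketched as a search for the cost-minimizing threshold on account-days drawn from the new distribution; the tuple layout and the cost constants below restate the slides' assumptions.

```python
def retune_threshold(scored_days: list[tuple[float, bool, float]],
                     candidates: list[float]) -> float:
    """Each day is (ltu_score, is_fraud, fraud_minutes). Returns the
    candidate threshold with the lowest total error cost."""
    def cost(threshold: float) -> float:
        total = 0.0
        for score, is_fraud, minutes in scored_days:
            alarmed = score > threshold
            if alarmed and not is_fraud:
                total += 5.00            # false alarm
            elif not alarmed and is_fraud:
                total += 0.40 * minutes  # missed fraudulent airtime
        return total
    return min(candidates, key=cost)
```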
+
44
Effects of Changing Fraud Distribution

[Chart: cost vs. percentage of non-fraud (75-100%) for the adaptive detector and the fixed 80/20 detector]
+
47
Conclusion

 DC-1 uses a rule-learning program to uncover indicators of fraudulent behavior from a large database of customer transactions

 Then the indicators are used to create a set of monitors, which profile legitimate customer behavior and indicate anomalies

 Finally, the outputs of the monitors are used as features in a system that learns to combine evidence to generate high-confidence alarms
+
48
Conclusion

 Adaptability to dynamic patterns of fraud can be achieved by generating fraud detection systems automatically from data, using data mining techniques

 DC-1 can adapt to the changing conditions typical of fraud detection environments

 Experiments indicate that DC-1 performs better than other methods for detecting fraud
+
49
Exam Questions
+
50
Question 1
• What are the two major fraud detection categories, how do they differ, and which does DC-1 fall under?
• Pre-call Methods
  • Involve validating the phone or its user when a call is placed
• Post-call Methods - DC-1 falls here
  • Analyze call data on each account to determine whether cloning fraud has occurred
+
51
Question 2
• Why do fraud detection methods need to be adaptive?
• Bandits change their behavior - patterns of fraud are dynamic
• Levels of fraud vary from month to month
• The cost of missing fraud or handling false alarms changes between inter-carrier contracts
+
52
Question 3
• What are the two steps of profiling monitors, and what are the two main monitor templates?
• Profiling step: measure an account's normal activity and save statistics
• Use step: process usage for an account-day to produce a numerical output describing how abnormal activity was on that account-day
• Threshold and Standard Deviation monitors
+
53
The End.
Questions?