Jeff Jonas, IBM Distinguished Engineer
June 21, 2010
Insider Threat Intelligence
Weak Signal Detection to Deal With the Bad Guy Within
© 2010 IBM Corporation

Meet Your Speakers
KEYNOTE SPEAKER
Jeff Jonas
Chief Scientist, IBM Entity Analytics
IBM Distinguished Engineer
MODERATOR
Charles Palmer
Director, Institute for Advanced Security
Chief Technologist of Cybersecurity and Privacy, IBM

About This Presentation
1. Principles about data at scale and detection of weak signal
2. Insider threat specifics

Background
1983: Founded Systems Research & Development (SRD)
1992: Assisted Vegas casinos in detecting the subjects of interest – resulting in a technology known as Non-Obvious Relationship Awareness (NORA)
1996: Created an identity-centric customer repository based on 4,200 disparate systems … >100 million resolved identities
2001: First technology funded by In-Q-Tel, the venture capital arm of the CIA
2003: Demoted self, hired a CEO
2005: SRD acquired by IBM, now Chief Scientist, IBM Entity Analytics
Today: Focus is in the area of ‘sensemaking on streams’ with special attention towards privacy and civil liberties protections

“The data must find the data … and the relevance must find the user.”

Macro Trend

Good News: The World is Not More Dangerous
Avg Age: 37 (1900: Western Europe); 67 (Today: Global Average)
Number Dead: 75M, ~17+% (1300’s: “Black Death”); 300M, ~4.5% (Today: If America sank into the ocean and everyone died)

“More Death Cheaper in Future” Graph
Execution Complexity
1st Nuke (130,000 people, $37B)
1945: ≈ 140,000 deaths

1918 Spanish Influenza Genome

“More Death Cheaper in Future” Graph
Execution Complexity
1st Nuke (130,000 people, $37B)
Re-animation of 1918 Spanish Influenza (<50 people, <$100k)
1945: ≈ 140,000 deaths
Today: ≈ 160,000,000 deaths
BAD!

Jerome Kerviel – US$7B
www.chinapost.com.tw/news_images/20080127/p1d.jpg

French Bank: Societe Generale’s
[Diagram labels: Back it out, Reinstate it, Analytic Checkpoint, 1 Day]

Trend: Organizations Are Getting Dumber
[Chart labels: Computing Power Growth, Available Observation Space, Context, Enterprise Amnesia, Sensemaking Algorithms, Time]

Amnesia is Embarrassing
Amnesia is Expensive

Trend: Organizations Are Getting Dumber
WHY?

Algorithms at Dead End. You Can’t Squeeze Knowledge Out of a Pixel.

Without Context
scrila34@msn.com

Risk Triage Consequences
Alert queues grow faster than humans can address them
The top item in the queue is not the most relevant item
Items require so much investigative effort that they are often abandoned prematurely
Risk assessment becomes the risk

Information without context is hardly actionable.

Context, definition of: Better understanding something by taking into account the things around it.

Context Accumulation
[Diagram labels: scrila34@msn.com, Job Applicant, Identity Thief, Top 500 Customer, Fraud Investigation]

Context Accumulation: Think Pixels to Pictures
An assertion is made with the arrival of each new piece (connects, near like neighbors, or un-associated)
Assertions favor the false negative
New connecting pieces … create new entities that are reevaluated against previously observed entities
Some pieces produce novel discovery
New pieces sometimes reverse earlier assertions
Working space quickly begins to exceed the final size
There can come a tipping point – a collapsing of the working space

False Negatives Overstate The Universe
[Chart labels: Unique Identities, True Population, Observations]

Counting Is Difficult
Two records, one from File 1 and one from File 2:
Mark Smith, 6/12/1978, 443-43-0000
Mark R Smith, (707) 433-0000, DL: 00001234

Counting: Degrees of Difficulty
Exactly Same: Bob Jones 123455 / Bob Jones 123455
Fuzzy: Bob Jones 123455 / Robert T Jonnes 000123455
Incompatible Features: Bob Jones 123455 / bjones@hotmail
Deceit: Bob Jones 123455 / Ken Wells 550119

The Rise and Fall of a Population
[Chart labels: Unique Identities, True Population, Observations]

Data Triangulation
Existing records, one from File 1 and one from File 2:
Mark Smith, 6/12/1978, 443-43-0000
Mark R Smith, (707) 433-0000, DL: 00001234
New Record:
Mark Randy Smith, 443-43-0000, DL: 00001234

“Expert Counting” is Fundamental to Prediction
Is it 5 people each with 1 account … or is it 1 person with 5 accounts?
If one cannot count … one cannot estimate vector or velocity (direction and speed).
Without vector and velocity … prediction is nearly impossible.
Therefore, if you can’t count, you can’t predict.

And, Oh By The Way … Deceit Revealed
Unique Identities
6 Liars Busted Here!
True Population
Observations

Take Note
To catch clever insiders … one must collect observations they don’t know you have.

Case Study: Las Vegas Casino
Data Sources
• 20,000 plus employees
• All vendors
• All slot club & table games-related players
• In-house arrests/incidents
• Known cheaters
Detected Relationships
• 24 active players were known cheaters
• 23 players had relationships to prior arrests/incidents
• 12 employees were themselves the player
• 192 employees had possible vendor relationships
• 7 employees were the vendor

Case Study: US Federal Agency
Data Sources
• 20,000 plus employees
• 75,000 plus vendors
• 200,000 plus Type 1 security risk entities
• 200,000 plus Type 2 security risk entities
Detected Relationships
• 140 employee relationships to vendors
• 1451 potential vendor relationships to security risks
• 253 employee relationships to security risk entities
• 2 vendors were the security risk
• “n” employees were the security risk/vendor

“The Data is the Query” Beats “Boil the Ocean”
[Diagram labels: Members Database, Employees Database, Arrests Database, Batch Analytics]

3rd Principle
Enterprise awareness is computationally most efficient when performed at the moment the observation is perceived.

Insider Threat Detection

About Insider Threats
Insiders know your policies, procedures, and people.
With this in mind, they can go to great lengths to avoid detection
Insider threats can manifest in a number of ways, including:
– Walking out with data
– Embedding bad systems, bad processes in the organization’s infrastructure
– Providing third parties critical vulnerability knowledge
– Non-business justified data alteration
– Peeping at data out of some non-business, personal interest

Data Leakage
Access used to obtain a copy of the data
– USB thumb drive
– CD
– Printed report
– Photo of a screen
What to do?
– Disable writable devices at desktop
– Immutable audit logs to chill inappropriate behavior
– Invasive searches at entrance and exit

Embedding Bad Systems, Bad Processes
Access to infrastructure leads to
– Malicious code
– Malicious devices
– Altered processes
What to do?
– Code reviews
– Internal QA process
– Oversight

Sharing Vulnerabilities with 3rd Parties
Knowledge of vulnerabilities shared with others
– Organized crime
– Other governments
– Competitors
What to do?
– A very tough one to solve

Data Alteration
Adding, changing or deleting records outside of business policy
– Adding information
– Deleting derogatory data
– Changing an account status
What to do?
– Immutable audit logs (for the chilling effect)
– Random user activity audits (for the chilling effect)
– Active audits detecting and alerting suspect user searches

Peeping
Did an employee with privileges look at records without a proper business purpose (e.g., VIPs, politicians, executives, or simply people in their immediate social circle)?
– Personal curiosity
– Inform others
– For sale
What to do?
– Immutable audit logs (for the chilling effect)
– Random user activity audits (for the chilling effect)
– Active audits detecting and alerting suspect user searches

Current Skunk Works Effort
A real-time active audit log to detect and alert on inappropriate data alteration and peeping
Compares user searches, access, or data changes that affect records of
– VIPs, politicians, executives, famous or suddenly famous
– Themselves
– Those close to them (family, neighbors)
– Those in their social circle
PS: Looking for organizations interested in being early victims of a first generation of such

Closing Thoughts

It’s all about competition.

To Beat the Competition …
[Diagram labels, in extracted order: Human Capital, Fastest, Sensemaking, First, Data, Tools]

“Every millisecond gained in our program trading applications is worth $100 million a year.”
Goldman Sachs, 2007
* Source: Automated Trader Magazine, 2007

Wish This On The Enemy
[Chart labels: Computing Power Growth, Available Observation Space, Context, Enterprise Amnesia, Sensemaking Algorithms, Time]

Enterprise Intelligence: The Way Forward
[Chart labels: Computing Power Growth, Available Observation Space, Context, Context Accumulation, Sensemaking Algorithms, Time]

Some Related Blog Posts
Algorithms At Dead-End: Cannot Squeeze Knowledge Out Of A Pixel
Puzzling: How Observations Are Accumulated Into Context
Data Finds Data
Federated Discovery vs. Persistent Context – Enterprise Intelligence Requires the Later
How to Use a Glue Gun to Catch a Liar
It Turns Out Both Bad Data and a Teaspoon of Dirt May Be Good For You
When Risk Assessment is the Risk
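
Appendix sketch: Data Triangulation
The "Counting Is Difficult" and "Data Triangulation" slides can be illustrated with a minimal Python sketch that uses the three Mark Smith records shown there. The matching rule (merge only on a shared SSN or driver's license number) and the names STRONG, strong, and resolve are assumptions made for illustration; this is not the NORA or IBM Entity Analytics algorithm, only one simple way the idea could work.

```python
# Assumption: only these identifier kinds are strong enough to justify a merge.
STRONG = {"ssn", "dl"}


def strong(entity):
    """Return the (kind, value) pairs of an entity that may trigger a merge."""
    return {pair for pair in entity if pair[0] in STRONG}


def resolve(entities, record):
    """Add one record; fold in every known entity sharing a strong identifier.

    Each entity is a set of (kind, value) pairs accumulated from its records.
    """
    new = set(record.items())
    kept = []
    for entity in entities:
        if strong(entity) & strong(new):
            new |= entity          # same identity: accumulate its context
        else:
            kept.append(entity)    # unrelated so far: leave it separate
    kept.append(new)
    return kept


# The three records from the slides, in arrival order.
records = [
    {"name": "Mark Smith", "dob": "6/12/1978", "ssn": "443-43-0000"},
    {"name": "Mark R Smith", "phone": "(707) 433-0000", "dl": "00001234"},
    {"name": "Mark Randy Smith", "ssn": "443-43-0000", "dl": "00001234"},
]

entities = []
counts = []
for record in records:
    entities = resolve(entities, record)
    counts.append(len(entities))

print(counts)  # [1, 2, 1]
```

The count of unique identities rises to 2 after the second record, because nothing yet connects the first two, and then falls back to 1 when the third record supplies both the SSN and the driver's license number. In this sketch the new observation reverses an earlier "two people" assertion, which is the behavior the "Rise and Fall of a Population" slide describes.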