Abstract for ISIPS 2008

The Requirements, Tasks and Solutions for a Privacy-Preserving Counterintelligence System

By Jesus Mena, Chief Strategy Officer, InferX Corporation

The Requirements: Collective Inferencing for Investigative Profiling

Suspicion scoring of avoidance behavior is available for counterintelligence and investigative data mining. Pattern recognition and profiling are accomplished by scoring suspicion under uncertainty via collective inferencing. In collective inferencing, seemingly meaningless data becomes meaningful by drawing inferences about everyone simultaneously, without reliance on guilt-by-association algorithms. This type of analysis searches for shadowy and unusual interactions and behavior in order to generate risk scores from transactions, conditions, events and sequences, and to draw out deviations from normal behavior.

[Figure: A self-adaptive counterintelligence system for collective inferencing]

The Tasks: Anonymous Text and Data Mining via Software Agents

Risk networks can be built to identify suspicious behavior patterns, forming the basis of self-adaptive counterintelligence systems. Profiles are created from multiple data sources in a totally anonymous manner, without the need to centralize or move any data. These techniques improve the detection of threats and events in cases where no embedded entity is available for social network analysis, that is, for discovering ‘who knows whom, where and when’ through guilt-by-association algorithms. Modeling algorithms can be used to discover suspicious patterns and outlier or anomalous behavior, as well as to extract key concepts from text.
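As a rough illustration of scoring outliers without centralizing or moving raw data, the minimal sketch below has each data source share only three aggregate numbers, from which a global baseline is computed and then used to score records locally. This is not the InferX method: the absolute z-score, the threshold of 2.0, the sample values and every name in the code (LocalSummary, summarize, combine, risk_scores) are assumptions made for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class LocalSummary:
    """Aggregate statistics a site shares instead of its raw records."""
    count: int
    total: float
    total_sq: float


def summarize(values):
    """Runs at each data source; only these three numbers leave the site."""
    return LocalSummary(len(values), sum(values), sum(v * v for v in values))


def combine(summaries):
    """Merge per-site summaries into a global mean and standard deviation."""
    n = sum(s.count for s in summaries)
    mean = sum(s.total for s in summaries) / n
    variance = max(sum(s.total_sq for s in summaries) / n - mean * mean, 0.0)
    return mean, math.sqrt(variance)


def risk_scores(values, mean, std):
    """Score local records against the global baseline (absolute z-score)."""
    if std == 0:
        return [0.0 for _ in values]
    return [abs(v - mean) / std for v in values]


# Hypothetical transaction amounts held at two separate sites.
site_a = [10.0, 12.0, 11.0, 9.0]
site_b = [10.0, 11.0, 95.0]

# Only the summaries are combined; the raw values never leave their site.
mean, std = combine([summarize(site_a), summarize(site_b)])

# Each site scores its own records; the threshold of 2.0 is an assumed cutoff.
scores_b = risk_scores(site_b, mean, std)
flagged = [v for v, s in zip(site_b, scores_b) if s > 2.0]
```

Sharing summaries rather than records is only one simple way to keep data in place; a real system would also need to consider what the aggregates themselves can reveal about small sites.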
[Figure: The distillation of intelligence from structured databases and unstructured documents with concept extractors]

The Solutions: Advanced Analytics for Counterintelligence

Advanced analytics and anonymous data networking technologies now exist that can analyze structured databases, unstructured documents, email and clickstream data for preemptive counterintelligence and reactive law enforcement investigations. These investigative techniques use networks coupled with data and text mining algorithms to discover suspicious behavior across multiple databases in a totally privacy-preserving manner. The information analyzed may reside in different computer systems, in various data formats, located anywhere in the world, since the investigative analyses and modeling can take place in real time over networks using anonymous pointers to the original location of the data. The following graphic offers a view of a distributed analytic suite of tools that combines algorithms for clustering, text analytics and predictive analytics over multiple data sources in different locations and formats:

[Figure: A web-centric privacy-protecting software solution to counterintelligence]

Presenter: Jesus Mena is the curator of the “Web Mining” topic at Scholarpedia.org and the author of five data mining books, including “Homeland Security Techniques and Technologies” and “Investigative Data Mining for Security and Criminal Detection”. He has consulted with Sandia Labs and the National Counterterrorism Center. Prior to joining InferX as its Chief Strategy Officer, he was the data mining contractor in the first department-wide audit of all analytical systems at the Department of Homeland Security, conducted for its Office of Inspector General.