Sensor Management Problems of Nuclear Detection – Layered Defense
Fred S. Roberts, Rutgers University

Multi-disciplinary, Multi-institutional Project
• Based at Rutgers University
• Partners at Princeton and Texas State University – San Marcos
• Collaborators at LANL, PNNL, and Sandia

Much of this work takes place at CCICADA, founded in 2009 as a DHS University Center of Excellence – the DHS CCI COE based at Rutgers.

Key Underlying Project Themes
• New developments in hardware are important in nuclear detection/prevention, but so are new algorithms, models, and statistical methods.
• Nuclear detection/prevention involves sorting through massive amounts of information.
• We need ways to make use of as many sources of information as possible.

Research Thrusts: Recent Work
1. Tools for Risk Assessment and Anomaly Detection
2. Layered Defense

Research Thrust 1: Tools for Risk Assessment and Anomaly Detection
a. Risk Scoring of Containers
b. Visualization of Data
c. Machine Learning to Distinguish Threat from Non-Threat Radiation
[Figure: Visualization of Port-to-Port Shipments]

Thrust 1 Recent Highlights – Container Risk Scoring
• Looked at a year's worth of manifest data from container ships, sampled every Wednesday.
• Goal: identify mislabeled or anomalous shipments through scrutiny of the manifest data.
• Used our penalized regression scoring to assign risk scores and to identify patterns or time trends in variables, with emphasis on relationships among container shipment contents, port of origin and destination, carrier, etc. (A minimal illustrative sketch follows this highlight.)
• Looked at manifest data from before and after the Japanese tsunami, expecting to find differences. [Image credit: National Geographic News]
• Found that the pattern of frequency data based on "domestic port of unlading" is statistically different before and after the tsunami, but the pattern based on the distribution of carriers is not. (A sketch of this kind of before/after comparison also follows.)
• Conclusion: don't depend on just one variable to uncover anomalies.
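The deck does not specify which penalized regression variant underlies the container risk scores, so the following is only a hedged sketch: it assumes an L1-penalized (lasso-style) logistic regression over one-hot encoded manifest fields, with hypothetical records and a hypothetical "flagged" label.

```python
# Hedged sketch of penalized-regression risk scoring for container manifests.
# Assumptions (not from the slides): L1-penalized logistic regression, one-hot
# encoded manifest fields, and a made-up "flagged" label for training.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Hypothetical manifest records: contents, foreign port of origin, US destination, carrier.
manifests = pd.DataFrame({
    "contents":    ["toys", "machinery", "toys", "chemicals", "machinery", "chemicals"],
    "origin_port": ["Shanghai", "Yokohama", "Shanghai", "Busan", "Yokohama", "Busan"],
    "dest_port":   ["Newark", "LA", "Newark", "LA", "Newark", "LA"],
    "carrier":     ["A", "B", "A", "C", "B", "C"],
    "flagged":     [0, 0, 0, 1, 0, 1],     # hypothetical past anomaly labels
})

features = ["contents", "origin_port", "dest_port", "carrier"]
model = Pipeline([
    ("encode", ColumnTransformer([("onehot", OneHotEncoder(handle_unknown="ignore"), features)])),
    ("score", LogisticRegression(penalty="l1", solver="liblinear", C=1.0)),
])
model.fit(manifests[features], manifests["flagged"])

# Risk score = predicted probability of being anomalous/mislabeled.
manifests["risk_score"] = model.predict_proba(manifests[features])[:, 1]
print(manifests[["contents", "origin_port", "carrier", "risk_score"]])
```

The L1 penalty is chosen here only because it drives many feature weights to zero, which mirrors the slides' interest in finding which manifest variables carry the signal; the actual scoring method in the project may differ.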
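The slides report that the frequency pattern by "domestic port of unlading" differs statistically before and after the tsunami while the carrier distribution does not, but they do not name the test used. A minimal sketch of one standard way to make such a comparison, a chi-square test on hypothetical count tables:

```python
# Hedged sketch: compare a categorical frequency pattern before vs. after an event.
# The counts below are made up; the slides do not state which test was used.
from scipy.stats import chi2_contingency

# Rows: before / after the event; columns: shipments by domestic port of unlading.
port_counts = [
    [120, 80, 60, 40],   # before (hypothetical)
    [ 90, 95, 30, 85],   # after  (hypothetical)
]
chi2, p_value, dof, expected = chi2_contingency(port_counts)
print("port-of-unlading pattern: chi2=%.1f, p=%.4f" % (chi2, p_value))
# A small p-value suggests the before/after distributions differ; repeating the
# same test on a carrier-by-carrier table might show no significant difference,
# which is why the slides warn against relying on a single variable.
```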
Thrust 1 Recent Highlights – Visualization of Manifest Data
• Data visualization is a powerful new area of research enabling rapid insight into patterns and departures from patterns.
• Analyzed relationships among container shipment contents, foreign port of origin, and US destination port.
• Encoded shipment information as weighted time-variant graphs amenable to fast stream processing and visualization. (A small sketch of this kind of representation appears after the layered-defense overview below.)
• Developed a novel representation of manifest data amenable to fast visualization and processing.
• Developed a novel algorithm based on "combinatorial discrepancy" to detect anomalous traffic in manifest data.

Thrust 1 Recent Highlights – Machine Learning to Distinguish Threat from Non-Threat Radiation
• Goal: distinguish non-threat sources of radiation from threat materials and identify the isotope.
• Compared machine learning topic-modeling algorithms: the recently popularized Higher-Order Latent Dirichlet Allocation (HO-LDA) vs. traditional LDA. (A baseline LDA sketch also appears after the layered-defense overview below.)
• Learning based on a data set of 302 spectra including 17 isotopes and background.
• Analyzed gamma-ray spectra generated by CZT-based handheld detectors.
• Concentrated on Ga-67, I-131, In-111, and Tc-99m.
• HO-LDA performed statistically significantly better than LDA.

Research Thrust 2: Layered Defense
[Diagram: Target]
• We have formulated a model of how to locate nuclear surveillance in the area around a facility, e.g., roadways and walkways approaching sports stadiums.
• This relates to a CCICADA project in connection with the National Football League.
• Developing simulation models for evacuation of stadiums.

Layered Defense
To develop our ideas, we have formulated a model of a "perimeter" defense of the target with several layers of defense:
• Limited budget for surveillance.
• How much to invest in each layer?
• Defense at outer layers might be less successful but could provide useful information to selectively refine and adapt strategies at inner layers.
• Arranging defense in layers so decisions can be made sequentially might significantly reduce costs and increase the chance of success.

Abstract model of layered defense:
• Target in the middle.
• Threats arrive via 4 inner channels.
• Each inner channel combines 2 outer flows of vehicles, persons, etc.
• Fixed budget for outer-layer defense and for inner-layer defense.
• Can choose among detectors with different characteristics and costs.
• How do we optimize the probability of detection?
[Diagram: Target]
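Returning to the Thrust 1 visualization highlight above: the slides describe encoding shipment information as weighted time-variant graphs, without giving the exact construction. A hedged sketch of one natural encoding, using networkx and hypothetical shipment records, with edges (foreign port, US port) weighted by shipment counts within each time window:

```python
# Hedged sketch: manifest shipments as a sequence of weighted graphs, one per week.
# The construction and the records are illustrative only, not the project's exact encoding.
from collections import defaultdict
import networkx as nx

# Hypothetical stream of (week, foreign_port, us_port) shipment records.
shipments = [
    (1, "Shanghai", "Newark"), (1, "Shanghai", "Newark"), (1, "Busan", "LA"),
    (2, "Shanghai", "Newark"), (2, "Yokohama", "LA"),     (2, "Busan", "LA"),
]

weekly_graphs = defaultdict(nx.DiGraph)     # one weighted digraph per time window
for week, src, dst in shipments:
    g = weekly_graphs[week]
    if g.has_edge(src, dst):
        g[src][dst]["weight"] += 1
    else:
        g.add_edge(src, dst, weight=1)

for week, g in sorted(weekly_graphs.items()):
    print("week", week, sorted(g.edges(data="weight")))
```

A per-window weighted graph like this is one way traffic patterns can be compared across time, which is the kind of structure an anomaly-detection algorithm could then scan; the project's actual representation and discrepancy-based algorithm are not reproduced here.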
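The HO-LDA method compared in the slides is a specialized research algorithm not available in standard libraries, so the following only illustrates the traditional-LDA baseline on synthetic data. Treating each gamma-ray spectrum as a "document" whose "words" are energy-channel counts is my assumption about how topic modeling is applied to spectra, not something stated in the deck; the data below are random placeholders, not the project's 302-spectrum set.

```python
# Hedged sketch: traditional LDA (the baseline in the slides) on synthetic "spectra".
# Each spectrum = a histogram of counts over energy channels, treated as a bag of words.
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.default_rng(0)
n_spectra, n_channels = 30, 64

# Two synthetic "sources": a low-energy-heavy profile and a high-energy-heavy profile.
profiles = np.vstack([
    np.linspace(2.0, 0.1, n_channels),
    np.linspace(0.1, 2.0, n_channels),
])
labels = rng.integers(0, 2, size=n_spectra)
spectra = np.stack([rng.poisson(50 * profiles[k] / profiles[k].sum()) for k in labels])

lda = LatentDirichletAllocation(n_components=2, random_state=0)
topic_mix = lda.fit_transform(spectra)          # per-spectrum topic proportions

# The dominant topic can then serve as a crude baseline source indicator.
print("dominant topic per spectrum:", topic_mix.argmax(axis=1))
print("true synthetic source:      ", labels)
```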
Different models for:
• Flow along different paths.
• Probability of detection at different locations (outer, inner).
• Allowable modifications of inner defense strategies based on outer-layer results.
[Diagram: Target]

Layered Defense
• Monitoring at the outer layer not only hinders an attacker but can provide information about the current state of the threat that can be used to refine and adapt strategies at inner layers.
• There is a complex tradeoff between maximizing the cost-effectiveness of each layer and the overall benefit of devoting some effort at the outer layer to gathering as much information as possible to maximize the effectiveness of the inner layer.
• We have formulated this as an optimization problem.

General Formulation: outer layer(s) plus inner layer(s) – paths of approach

Model Assumptions, First Model:
• Each incoming path u has a dangerous "flow" F_u.
• At each sensor k, the probability of detection is a function D_k(R_k) of the resources R_k allocated to that sensor.
• Assume that D_k(R_k) is a concave, piecewise linear function.

First Model Special Case: The Case of Two Layers
• Assume that the outside layers share a limited resource budget, and so do the inside layers. More subtle models allow one to decide how much budget to allocate between inside and outside.
• Goal: allocate the total outside resources among the individual outside sensors, and the total inside resources among the individual inside sensors, so as to maximize the illegal flow detected.
• Note: so far, this model does not have the random allocation of resources to sensors that we ultimately aim for to confuse the attacker. That is an added component for future work.
• Since there are only 2 layers, we can identify the path name with the outer-layer sensor where it begins; thus, path u is the path beginning at outer sensor u.
• The objective combines, for each path, the dangerous flow captured at outside sensor j and the dangerous flow not captured at outside sensor j that is captured at inside sensor i.

Solving the Optimization Problem
• This formulates the problem as a non-linear optimization problem.
• A standard approach to such problems is brute force: fix a resource "mesh" size, discretize the resource space for each sensor into subintervals, and examine every possible resource allocation. That approach is not computationally feasible for the problem as we have formulated it.
• We have developed a new approach for our context: still discretize the resource space for the interior sensors into subintervals, but find the optimal configuration for the exterior sensors by solving a linear programming problem for each combination of interior and exterior sensors.
• An improvement, but this is still too computationally intensive. However, a dynamic programming variant avoids the worst part of the computation. (A toy brute-force illustration of the underlying objective follows this slide group.)
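The slides label the two terms of the two-layer objective but do not display the formula itself, so the sketch below assumes the natural reading: detected flow = Σ_j F_j·[D_j(r_j) + (1 − D_j(r_j))·D_i(s_i)], where j ranges over outside sensors, i is the inside sensor on that path, and the D's are concave piecewise-linear detection functions. It is a toy brute-force enumeration on a tiny hypothetical instance, not the project's LP/dynamic-programming method.

```python
# Toy illustration of the assumed two-layer objective (NOT the project's LP/DP method).
# Assumed form: detected flow = sum_j F_j * [D_j(r_j) + (1 - D_j(r_j)) * D_in(s)],
# with concave piecewise-linear detection functions. All numbers are hypothetical.
import itertools

def pw_linear(breaks):
    """Concave piecewise-linear detection function from (resource, prob) breakpoints."""
    def D(r):
        x0, y0 = breaks[0]
        for x1, y1 in breaks[1:]:
            if r <= x1:
                return y0 + (y1 - y0) * (r - x0) / (x1 - x0)
            x0, y0 = x1, y1
        return breaks[-1][1]
    return D

# Two outside sensors feeding one inside sensor (a tiny made-up instance).
F = [3.0, 5.0]                                            # dangerous flow on each path
D_out = [pw_linear([(0, 0.0), (2, 0.5), (4, 0.7)]),
         pw_linear([(0, 0.0), (2, 0.6), (4, 0.8)])]
D_in = pw_linear([(0, 0.0), (3, 0.6), (6, 0.9)])
B_out, B_in, mesh = 4.0, 6.0, 0.5                         # budgets and mesh size

def detected(r_out, s_in):
    return sum(f * (Do(r) + (1 - Do(r)) * D_in(s_in))
               for f, Do, r in zip(F, D_out, r_out))

out_grid = [i * mesh for i in range(int(B_out / mesh) + 1)]
in_grid = [i * mesh for i in range(int(B_in / mesh) + 1)]
best = max(((detected((r0, r1), s), (r0, r1), s)
            for r0, r1 in itertools.product(out_grid, repeat=2) if r0 + r1 <= B_out
            for s in in_grid),
           key=lambda t: t[0])
print("best detected flow %.3f with outside allocation %s and inside allocation %.1f" % best)
```

Even this tiny instance makes the slides' point: the mesh-enumeration grows combinatorially with the number of sensors, which is why the project moves to LP and dynamic-programming formulations.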
Methods Solve Some Special Cases
[Figure: detection network architecture]
• First assumption: linear detection rates both inside and outside.
[Figure: changing the detection rate function]
• Our methods for this simple problem, as well as for the more complex problems we will describe, were run on a simple AMD Phenom X4 9550 workstation with 6 GB of DDR2 RAM, and often produced solutions in a matter of seconds.

A more complicated network: multiple outside sensors
• Case of 2 outside sensors (green and blue) and 1 inside sensor.
• Piecewise-linear detection rate functions.
[Figure: two outside sensors feeding one inside sensor]

A more complicated network: multiple outside and multiple inside sensors
• Our methods generalize to this case.
• Solution with 4 inside sensors and 2 outside sensors per inside sensor: the solution "tableau" includes 10,302 distinct points, and the solution took under 2 minutes on a modest workstation.
• The methods are feasible up to 10 inside sensors; after that, approximation methods are needed.

Case of an Adaptive Adversary
• So far, our model assumed a fixed flow of dangerous material on each pathway.
• What if we have an adaptive adversary who recognizes how much of a resource we use for sensors on each node and then chooses the path that minimizes the probability of detection?
• To defend against such an adversary, we might seek to assign sensor resources so as to maximize the minimum detection rate on any path. (A toy max-min sketch follows this slide group.)

The Case of Two Layers with an Adaptive Adversary
• We have developed methods that work with multiple inside sensors and multiple outside sensors.
• Solution with 4 inside sensors and 2 outside sensors per inside sensor: the solution "tableau" had 40,401 distinct points, and the solution took 3,102 seconds (about 52 minutes) on a modest workstation.
• We hope to speed this up so the methods are feasible for up to 10 inside sensors; after that, approximation methods are needed.

Testing Layered Defense Ideas at NFL Stadiums
• Working with NFL stadiums.
• Looking at a variety of inspection problems, not just nuclear detection.
• Gathering data about how they do layered defense and building simulation models.

Model for inspection
• Assume all basic inspection methods perform like M/M/1 queues (inter-arrival times and service times are exponentially distributed). (Standard M/M/1 measures are sketched after this slide group.)
• Studying a variety of different kinds of inspections.
• Five measures of effectiveness: detection rate, false alarm rate, monetary cost, throughput, and average waiting time.
• Comparing different kinds of strategies:
  – Mixed strategy: execute inspection strategy Ai on fraction xi of people.
  – Layered strategy: execute strategy A for everyone; then strategy B on those who test positive and strategy C on those who test negative.
  – Distributed strategy: split the current queue for strategy A into a k-multiserver queue for strategy A.
  – Randomization strategy: for when you can't inspect everyone.
• For layered strategies: we have developed an algorithm for finding the convex hull of "dominating" strategies, i.e., those that satisfy conditions such as maximizing detection rate and minimizing false alarm rate and monetary cost, subject to constraints on maximum cost and minimum throughput.
• The algorithm runs in a few seconds with a maximum of 2 layers; it takes 30 minutes for 3 layers.
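The deck poses the adaptive-adversary problem as max-min (maximize the minimum detection rate over paths) but does not show its solution method. A hedged toy illustration of the idea, using a grid search over a discretized budget and made-up capped-linear detection functions (not the project's algorithm):

```python
# Toy max-min allocation against an adaptive adversary (illustration only).
# The adversary picks the path with the lowest detection probability, so we
# allocate a discretized budget to maximize that minimum. All numbers are hypothetical.
import itertools

def detect(resource, slope, cap):
    """Simple concave (capped linear) detection probability."""
    return min(cap, slope * resource)

paths = [  # (slope, cap) of the sensor guarding each path -- hypothetical
    (0.30, 0.90),
    (0.20, 0.80),
    (0.25, 0.95),
]
budget, mesh = 6.0, 0.5
grid = [i * mesh for i in range(int(budget / mesh) + 1)]

best_value, best_alloc = -1.0, None
for alloc in itertools.product(grid, repeat=len(paths)):
    if sum(alloc) > budget:
        continue
    worst = min(detect(r, slope, cap) for r, (slope, cap) in zip(alloc, paths))
    if worst > best_value:
        best_value, best_alloc = worst, alloc

print("max-min detection %.3f with allocation %s" % (best_value, best_alloc))
```

The max-min objective tends to push resources toward the weakest path, which is exactly the defensive posture the slides describe against an adversary who can observe the deployment.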
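For the inspection model, the slides assume each basic inspection behaves like an M/M/1 queue and track throughput and average waiting time among the measures of effectiveness. The standard M/M/1 formulas cover the queueing measures; the arrival and service rates below are hypothetical.

```python
# Standard M/M/1 performance measures (textbook formulas; the rates are hypothetical).
def mm1_measures(lam, mu):
    """lam = arrival rate, mu = service rate (per minute); requires lam < mu."""
    assert lam < mu, "queue is unstable unless arrival rate < service rate"
    rho = lam / mu                  # utilization
    L = rho / (1 - rho)             # average number in system
    W = 1 / (mu - lam)              # average time in system
    Wq = rho / (mu - lam)           # average waiting time in queue
    throughput = lam                # in steady state, all arrivals are eventually served
    return {"utilization": rho, "avg_in_system": L,
            "avg_time_in_system": W, "avg_wait_in_queue": Wq,
            "throughput": throughput}

# Example: a wanding station serving 5 fans/minute with fans arriving at 4/minute.
print(mm1_measures(lam=4.0, mu=5.0))
```

Detection rate, false alarm rate, and monetary cost are properties of the inspection technology rather than of the queue, so they would be supplied separately when comparing mixed, layered, distributed, and randomized strategies.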
Testing Layered Defense Ideas at NFL Stadiums – In Practice
• Looking at three types of inspection: wanding, pat-down, and bag inspection.
• Observing stadium inspections and gathering data about each type of inspection, in particular the length of time it takes.
• The data show major differences depending on the inspector, time before kickoff, etc.
• Also looking at doing ticket scans first, as an extra layer of inspection.
[Photos: wanding at stadium entrances]

Project Team
• Rutgers University: Fred Roberts, James Abello, Tsvetan Asamov (grad student), Endre Boros, Mikey Chen (undergrad), Jerry Cheng (grad student), Sid Dalal (RAND Corp, consultant), Robert Davis (undergrad student), Emilie Hogan (grad student), Christopher Janneck (grad student), Paul Kantor, Adam Marszalek (grad student), Dimitris Metaxas, Christie Nelson (grad student), Alantha Newman (postdoc), Neel Parikh (undergrad), Jason Perry (grad student), Bill Pottenger, Brian Thompson (grad student), Minge Xie, Emre Yamangil (grad student), Stavros Zinonos (grad student)
• Princeton University: Warren Powell, Savas Dayanik, Peter Frazier (grad student), Ilya Rhyzov (grad student), Kazutoshi Yamazaki (grad student)
• Texas State University – San Marcos: Nate Dean, Jill Cochran (grad student)

Project Team: National Lab Partners (helping with advice, information, data)
• PNNL: Terence Critchlow, James Ely, Cliff Joslyn
• LANL: Frank Alexander, Nick Hengartner
• Sandia: Jon Berry, Bill Hart

Thank you

Project Summary
Title: Sensor Management Problems of Nuclear Detection
Org/PI: Rutgers University / Fred S. Roberts
[Diagram: Thrust 1 – tools for threat detection and risk assessment (isotope ID, manifest data analysis, machine learning, risk scoring for containers); Thrust 2 – layered defense]

Technical Merit: Our project focuses on managing and mitigating uncertainty for improved collection and interpretation of sensor data while exploiting randomness for unpredictable surveillance. Classification methods tailored to radiation sensor data can reduce nuisance alarms; new methods for analyzing manifest data lead to anomaly detection and risk scoring; layered surveillance helps thwart adversaries.

Technical Approach: Risk scoring methods; visualization for anomaly detection; machine learning for isotope identification; optimization and simulation for layered defense.

Broader Impact:
• Students supported: postdoc (1); graduate (10); undergrad (3).
• Part of 3 PhD dissertations to date, with 3 more nearing completion; more than 10 additional graduate students participating.
• Developed a new undergraduate course on "Optimal Learning" at Princeton University, with a related textbook in progress.
• Held a workshop involving five projects in the DNDO program, plus a Fall 2010 workshop on adversarial decision making.
• Enhanced relations with national labs, including 2 summer internships.
• Many project methods apply to other fields: e.g., machine learning methods are being applied to police force deployment, and layered defense to NFL games.
• 30 papers published/accepted; 11 more under review.

Schedule/Cost: Duration 48 months (44 months to date). PY01: $486K; PY02: $491K; PY03: $494K; PY04+05: $529K.
Major Milestones / Accomplishments: Developed machine learning tools for risk scoring and isotope classification, especially higher-order methods and preprocessing tools; new statistical methods for risk scoring and a new split-and-conquer algorithm for larger data sets; visualizations to observe patterns in manifest data and rapidly pinpoint anomalies; novel models of layered defense.

Team: Co-PI: Warren Powell, Princeton University. Collaborating universities: Princeton University; Texas State University – San Marcos. National labs interaction: PNNL; LANL; Sandia; LLNL.

Last updated on: 07/20/12