The Bioterrorism Sensor Location Problem • Early warning is critical • This is a crucial factor underlying government’s plans to place networks of sensors/detectors to warn of a bioterrorist attack The BASIS System Locating Sensors is not Easy • Networks of sensors are expensive • Ways to locate them to maximize “coverage” and expedite an alarm are not easy to determine • Approaches that improve upon existing, ad hoc location methods could save countless lives in the case of an attack and also money in capital and operational costs. Two Fundamental Problems • Sensor Location Problem (SLP): – Choose an appropriate mix of sensors – decide where to locate them for best protection and early warning Two Fundamental Problems • Pattern Interpretation Problem (PIP): When sensors set off an alarm, help public health decision makers decide – Has an attack taken place? – What additional monitoring is needed? – What was its extent and location? – What is an appropriate response? Two Fundamental Problems • The SLP and PIP are ripe for: – Precise formulation – Mathematical modeling – Algorithmic analysis – Applications of powerful new statistical methods Two Fundamental Problems • Relevant tools include: – Network design – Network analysis – Location theory – Reliability theory – Data mining – Fluid dynamic modeling – Source-to-dose modeling – Time series analysis Types of Sensors • There are many types of sensors. Portal shield Bioparticulate counter/detector Dry filtration unt (portable) The SLP • Sensor technology is changing rapidly • Sensors come with a variety of characteristics • A good sensor location methodology should not be dependent upon particular sensor technologies. The SLP: What is a Measure of Success of a Solution? • A modeling problem. • Needs to be made precise. • Many possible formulations. The SLP: What is a Measure of Success of a Solution? • Identify and ameliorate false alarms. • Defending against a “worst case” attack or an “average case” attack. • Minimize time to first alarm? (Worst case? (Average case?) • Maximize “coverage” of the area. – Minimize geographical area not covered – Minimize size of population not covered – Minimize probability of missing an attack The SLP: What is a Measure of Success of a Solution? •Cost: Given a mix of available sensors and a fixed budget, what mix will best accomplish our other goals? The SLP: What is a Measure of Success of a Solution? •It’s hard to separate the goals. •Even a small number of sensors might detect an attack if there is no constraint on time to alarm. •Without budgetary restrictions, a lot more can be accomplished. Modeling Issues: Types of Sensors • Sensor technology is changing rapidly. • Methods we develop should not be dependent upon Much of present technology today’s depends upon hand-held rapid technology. PCR assay together with software for BW agent identification Modeling Issues: Types of Sensors Sensors differ as to: – Response – Accuracy and reliability – Stationarity vs. mobility – Constraints on their location – Cost – How sensor data is reported Reporting of Sensor Data • Do humans physically examine collection devices? Reporting of Sensor Data • Is the data electronically reported? • Reporting at discrete times? • Reporting continuously? Other Relevant Modeling Issues • Probability of Release at Different Locations • Geography • Buildings • Weather • Population Distribution Algorithmic Approaches I : Greedy Algorithms Greedy Algorithms • Find the most important location first and locate a sensor there. • Find second-most important location. • Etc. • Builds on earlier work at IDA (Grotte, Platt) • Steepest ascent approach. • No guarantee of optimality. • In practice, peak of objective function is rather flat, so not hard to get close to optima. Algorithmic Approaches II : Variants of Classic Location and Clustering Methods Algorithmic Approaches II : Variants of Classic Location and Clustering Methods • Location theory: locate facilities (sensors) to be used by users located in a region. • Cluster analysis: Given points in a metric space, partition them into groups or clusters so points within clusters are relatively close. • Clusters correspond to points covered by a facility (sensor). Variants of Classic Location and Clustering Methods • k-median clustering: Given k sensors, place them so each point in the city is within x feet of a sensor. • Complications: More dimensions: location affects sensitivity, wind strength enters, sensors have different characteristics, etc. • This higher-dimensional k-median clustering problem is hard! Best-known algorithms are due to Rafail Ostrovsky. Variants of Classic Location and Clustering Methods • Further complications make this even more challenging: – Different costs of different sensors – Restrictions on where we can place different sensors – Is it better to have every point within x feet of some sensor or every point within y feet of at least three sensors (y > x)? • Approximation methods due to Chuzhoy, Ostrovsky, and Rabani and to Guha, Tardos, and Shmoys are relevant. Algorithmic Approaches III : Variants of Highway Sensor Network Algorithms Variants of Highway Sensor Network Algorithms • Sensors located along highways and nearby pathways measure atmospheric and road conditions. • Muthukrishnan, et al. have developed very efficient algorithms for sensor location. • Based on “bichromatic clustering” and “bichromatic facility location” (color nodes corresponding to sensors red, nodes corresponding to sensor messages blue) Variants of Highway Sensor Network Algorithms • These algorithms apply to situations with many more sensors than the bioterrorism sensor location problem. • As BT sensor technology changes, we can envision a myriad of miniature sensors distributed around a city, making this work all the more relevant. Algorithmic Approaches IV : Variants of Air Pollution Monitoring Models Variants of Air Pollution Monitoring Models • Long history of using models to locate air pollution monitors. • MENTOR: Modeling Environment for Total Risk; developed by team at Rutgers and R.W.J. Medical School (Panos Georgopoulos, Paul Lioy) at Center for Exposure and Risk Modeling Variants of Air Pollution Monitoring Models • MENTOR builds on – – – – personal exposures Source-to-dose modeling Environmental conditions Weather • MENTOR is a powerful computational tool. • However, the models it uses are not nearly as large or as complex as those traditionally used in nuclear weapons research at Los Alamos and elsewhere. Variants of Air Pollution Monitoring Models • Combine air pollution monitor placement modeling tools like MENTOR with iteration/simulation tools. • Piecewise heuristic approach developed by Tom Boucher, David Coit, E. Elsayed • Based on initial simulation results, divide problem into subproblems and repeat local optimization algorithms • Method recently applied to counterterrorism applications by Pate-Cornell. Algorithmic Approaches V : Building on Equipment Placing Algorithms Building on Equipment Placing Algorithms • The “Node Placement Problem” is problem of determining locations or nodes to install certain types of networking equipment. • “Coverage” and cost are a major consideration. • Researchers at Telcordia Technologies have studied variations of this problem arising from broadband access technologies. The Broadband Access Node Placement Problem • There are inherent range limitations that drive placement. • E.g.: customer for DSL service must be within xx feet of an assigned multiplexer. • Multiplexer = sensor. • Problem solved using dynamic programming algorithms. (Tamra Carpenter, Martin Eiger,David Shallcross, Paul Seymour) The Broadband Access Node Placement Problem: Complications • Restrictions on types of equipment that can be placed at a given node. • Constraints on how far a signal from a given piece of equipment can travel. • Cost and profit maximization considerations. • Relevance of work on general integer programming, the knapsack cover problem, and local access network expansion problems. The Pattern Interpretation Problem The Pattern Interpretation Problem • It will be up to the Decision Maker to decide how to respond to an alarm from the sensor network. The Pattern Interpretation Problem • Little has been done to develop analytical models for rapid evaluation of a positive alarm or pattern of alarms from a sensor network. • How can this pattern be used to minimize false alarms? • Given an alarm, what other surveillance measures can be used to confirm an attack, locate areas of major threat, and guide public health interventions? The Pattern Interpretation Problem (PIP) • Close connection to the SLP. • How we interpret a pattern of alarms will affect how we place the sensors. • The same simulation models used to place the sensors can help us in tracing back from an alarm to a triggering attack. Approaching the PIP: Minimizing False Alarms Approaching the PIP: Minimizing False Alarms • One approach: Redundancy. Require two or more sensors to make a detection before an alarm is considered confirmed. Approaching the PIP: Minimizing False Alarms • Portal Shield: requires two positives for the same agent during a specific time period. • Redundancy II: Place two or more sensors at or near the same location. Require two proximate sensors to give off an alarm before we consider it confirmed. • Redundancy drawbacks: cost, delay in confirming an alarm. Approaching the PIP: Using Decision Rules • Existing sensors come with a sensitivity level specified and sound an alarm when the number of particles collected is sufficiently high – above threshold. Approaching the PIP: Using Decision Rules • Alternative decision rule: alarm if two sensors reach 90% of threshold, three reach 75% of threshold, etc. • One approach: use clustering algorithms for sounding an alarm based on a given distribution of clusters of sensors reaching a percentage of threshold. Approaching the PIP: Using Decision Rules • When sensors are to be used jointly, the rules for “tuning” each sensor should be optimized to take advantage of the fact that each is part of a network. • The optimal tuning depends on the decision rule applied to reach an overall decision given the sensor inputs. Approaching the PIP: Using Decision Rules • Prior work along these lines in missile detection (Cherikh and Kantor) Approaching the PIP: Using Decision Rules • Most work has concentrated on the case of stochastic independence of information available at two sensors – clearly violated in BT sensor location problems. • Even with stochastic independence, finding “optimal” decision rules is nontrivial. • Recent promising approaches of Paul Kantor: study fusion of multiple methods for monitoring message streams. Approaching the PIP: SpatioTemporal Mining of Sensor Data Approaching the PIP: SpatioTemporal Mining of Sensor Data • Sensors provide observations of the state of the world localized in space and time. • Finding trends in data from individual sensors: time series data mining. • PIP: detecting general correlations in multiple time series of observations. • This has been studied in statistics, database theory, knowledge discovery, data mining. • Complications: proximity relationships based on geography; complex chronological effects. Approaching the PIP: SpatioTemporal Mining of Sensor Data • Sensor technology is evolving rapidly. • It makes sense to consider idealized settings where data are collected continuously and communicated instantly. • Then, modern methods of spatio-temporal data mining due to Muthukrishnan and others are relevant. Approaching the PIP: SpatioTemporal Mining of Sensor Data Work on Cellular networks and IP networks is relevant. Approaching the PIP: SpatioTemporal Mining of Sensor Data • There is relevant work of Muthukrishnan on cellular and IP networks: – Time-of-day effects in traffic calls across the country – Geographic patterns in users’ mobility – Correlations between IP router time series data. • New challenges: heterogeneous capabilities of nodes; in telecommunications, most nodes have similar capabilities. Approaching the PIP: SpatioTemporal Mining of Sensor Data Promising Statistical Methods • Still in idealized setting: continuous sensor data collection. • Building on the Bayesian approach to modeling spatio-temporal data. Thomas Bayes 1702-1761 Approaching the PIP: SpatioTemporal Mining of Sensor Data Promising Statistical Methods • Bayesian approaches take advantage of recent dramatic advances in simulation technology (Markov chain Monte Carlo) • Limitations of existing methods: dependence on “batch analysis;” arrival of new data means start from scratch. Approaching the PIP: SpatioTemporal Mining of Sensor Data Promising Statistical Methods • There is need for “online” or “sequential” methods that update models as data comes in. • Relevance of recent work of Ridgeway and Madigan on sequential Monte Carlo methods using “particle filters” Approaching the PIP: SpatioTemporal Mining of Sensor Data Additional Promising Statistical Methods • Methods for visualizing the data will help human decision makers. • Methods of statistical process control are relevant to finding the most effective ways to aggregate data across sensors to detect anomalies. Approaching the PIP: Triggering Other Methods of Surveillance • One type of BT surveillance cannot be considered in isolation. • Relevant work in talks of Madigan/Rolka, Pagano, and Zelicoff • Question: How can the pattern of sensor warnings guide other biosurveilance methods? Approaching the PIP: Triggering Other Methods of Surveillance • Increased syndromic surveillance? • Change threshold for alarm in syndromic surveillance? • Increased attention to E.R. visits in a certain region? Approaching the PIP: Triggering Other Methods of Surveillance • Decreased threshold for alarm from subway worker absenteeism levels? Approaching the PIP: Triggering Other Methods of Surveillance • If there is an initial alarm, each sensor may be read more often. • How do we pick the sensors to read more frequently? • This is “adaptive biosensor engagement.” • Methods of bichromatic combinatorial optimization may be relevant. • As for the SLP, sensors get one color, sensor messages another. • Relevance of work of Muthukrishnan. There are Remarkably Many Challenges from this One Problem! Thanks to DIMACS SLP/PIP Team: • • • • • • • • Benjamin Avi-Itzhak Thomas Boucher Tamra Carpenter David Coit Elsayed Elsayed Panos Georgopoulos Mel Janowitz Paul Kantor •Howard Karloff •Jon Kettenring •Paul Lioy •David Madigan •S. Muthukrishnan •Rafail Ostrovsky •Michael Rothkopf •Yehuda Vardi Thanks also to: • • • • • • • Jeff Grotte, Institute for Defense Analyses Farzad Mostashari, NYC Dept. of Health Dennis Nash, NYC Dept.of Health Nathan Platt, Institute for Defense Analyses Al Rhodes, Defense Threat Reduction Agency Jay Spingarn, DefenseThreat Reduction Agency Fred Steinberg, MITRE Corp.