Decision Support Algorithms for Port of Entry Inspection

Fred S. Roberts
DIMACS Center, Rutgers University
In collaboration with Los Alamos National Laboratory
Preliminary Support: Office of Naval Research

Port of Entry Inspection Algorithms
•Goal: Find ways to intercept illicit nuclear materials and weapons destined for the U.S. via the maritime transportation system
•Currently, only a small percentage of containers arriving at ports is inspected
•Even inspecting 8% of containers in the Port of NY/NJ might bring international trade to a halt (Larrabee 2002)

Port of Entry Inspection Algorithms
•Aim: Develop decision support algorithms that will help us to “optimally” intercept illicit materials and weapons subject to limits on delays, manpower, and equipment
•Find inspection schemes that minimize total “cost”, including the “cost” of false positives and false negatives
Figure: Mobile VACIS, a truck-mounted gamma ray imaging system

Sequential Decision Making Problem
•Stream of containers arrives at a port
•The Decision Maker’s Problem:
•Which to inspect?
•Which inspections next based on previous results?
•Approach:
–“decision logics”
–combinatorial optimization methods
–Builds on ideas of Stroud and Saeger at LANL
–Need for new models and methods

Sequential Decision Making Problem
•Containers arriving to be classified into categories.
•Simple case: 0 = “ok”, 1 = “suspicious”
•Inspection scheme: specifies which inspections are to be made based on previous observations

Sequential Decision Making Problem
•Containers have attributes, each in a number of states
•Sample attributes:
–Does ship’s manifest set off an “alarm”?
–What is the neutron or gamma emission count? Is it above threshold?
–Does a radiograph image come up positive?
–Does an induced fission test come up positive?
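
The idea of an inspection scheme, which picks the next inspection based on previous observations, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the attribute names `manifest_alarm` and `gamma_above_threshold`, the rule inside `example_scheme`, and the helper `inspect` are hypothetical and do not come from the slides.

```python
from typing import Callable, Dict, List, Tuple

Observations = Dict[str, int]   # attribute name -> observed state (0 or 1)
Step = Tuple[str, object]       # ("test", attribute_name) or ("classify", category)


def example_scheme(seen: Observations) -> Step:
    """Hypothetical rule: read the manifest first; test gamma only after an alarm."""
    if "manifest_alarm" not in seen:
        return ("test", "manifest_alarm")
    if seen["manifest_alarm"] == 0:
        return ("classify", 0)          # 0 = "ok"
    if "gamma_above_threshold" not in seen:
        return ("test", "gamma_above_threshold")
    # 1 = "suspicious" only if the gamma count was also above threshold
    return ("classify", seen["gamma_above_threshold"])


def inspect(container: Observations,
            scheme: Callable[[Observations], Step]) -> Tuple[int, List[str]]:
    """Run the scheme on one container; return (category, inspections performed)."""
    seen: Observations = {}
    performed: List[str] = []
    while True:
        action, value = scheme(seen)
        if action == "classify":
            return value, performed
        # Perform the requested inspection and record what was observed.
        seen[value] = container[value]
        performed.append(value)
```

Calling `inspect({"manifest_alarm": 0, "gamma_above_threshold": 1}, example_scheme)` stops after the manifest check, which shows how a scheme can avoid running every inspection on every container.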
Figure: Gamma ray detector

Sequential Decision Making Problem
•Simplest Case: Attributes are in state 0 or 1
•Then: Container is a binary string like 011001
•So: Classification is a decision function F that assigns each binary string to a category.
011001 → F(011001)
If attributes 2, 3, and 6 are present, assign container to category F(011001).

Sequential Decision Making Problem
•If there are two categories, 0 and 1, decision function F is a boolean function.
Example: F(000) = F(111) = 1, F(abc) = 0 otherwise
This classifies a container as positive iff it has none of the attributes or all of them.

Sequential Decision Making Problem
•Given a container, test its attributes until we know enough to calculate the value of F.
•An inspection scheme tells us in which order to test the attributes to minimize cost.
Hard Questions
•Even this simplified problem is hard computationally.

Binary Decision Tree Approach
•Sensors measure presence/absence of attributes.
•Binary Decision Tree:
–Nodes are sensors or categories (0 or 1)
–Two arcs exit from each sensor node, labeled left and right.
–Take the right arc when the sensor says the attribute is present, left arc otherwise

Binary Decision Tree Approach
•Reach category 1 from the root only through the path a0 to a1 to 1.
•Container is classified in category 1 iff it has both attributes a0 and a1.
•Corresponding boolean function: F(11) = 1, F(10) = F(01) = F(00) = 0.
Figure 1

Binary Decision Tree Approach
•Reach category 1 from the root by: a0 L to a1 R to a2 R to 1, or a0 R to a2 R to 1
•Container is classified in category 1 iff it has (a1 and a2 and not a0) or (a0 and a2, and possibly a1).
•Corresponding boolean function: F(111) = F(101) = F(011) = 1, F(abc) = 0 otherwise.
Figure 2

Binary Decision Tree Approach
•This binary decision tree corresponds to the same boolean function F(111) = F(101) = F(011) = 1, F(abc) = 0 otherwise. However, it has one fewer observation node ai. So, it is more efficient if all observations are equally costly and equally likely.
Figure 3

Binary Decision Tree Approach
•Even if the boolean function F is fixed, the problem of finding the “optimal” binary decision tree for it is very hard (NP-complete).
•For small n = number of attributes, we can try to solve it by brute force enumeration.
•Even for n = 4, this is not practical. (n = 4 at Port of Long Beach-Los Angeles)
Figure: Port of Long Beach

Binary Decision Tree Approach
Promising Approaches:
•Heuristic algorithms, approximations to optimal.
•Special assumptions about the boolean function F.
•Example: For “monotone” boolean functions, integer programming formulations give promising heuristics.

Cost Functions
•Above analysis: Only uses number of sensors
•Using a sensor has a cost:
–Unit cost of inspecting one item with it
–Fixed cost of purchasing and deploying it
–Delay cost from queuing up at the sensor station
•Unit Cost Complication: How many nodes of the decision tree are actually visited during the average container’s inspection? Depends on the “distribution” of containers.

Cost Functions: Delay Costs
•Stochastic process of containers arriving
•Distribution of delay times for inspections
•Use queuing theory to find average delay times under different models

Cost Functions
•Cost of false positive: Cost of additional tests.
–If it means opening the container, it’s very expensive.
•Cost of false negative: Complex issue.
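
The binary decision tree slides above can be illustrated with a minimal Python sketch. It encodes a tree as nested tuples, checks that two different trees compute the same boolean function F(111) = F(101) = F(011) = 1, and compares the average number of sensor readings when all eight attribute strings are equally likely. The first tree follows the paths listed for Figure 2. The second is a hypothetical three-sensor tree for the same function and is not claimed to be the tree drawn in Figure 3. All names are illustrative assumptions.

```python
from itertools import product

# A tree is a category (0 or 1) or a tuple (attribute_index, left_subtree, right_subtree).
# The right subtree is taken when the attribute is present (state 1).
FIGURE_2_TREE = (0, (1, 0, (2, 0, 1)), (2, 0, 1))   # paths as listed for Figure 2
ALTERNATIVE_TREE = (2, 0, (0, (1, 0, 1), 1))        # hypothetical tree, tests a2 first


def classify(tree, container):
    """Walk the tree for one container; return (category, number of sensors used)."""
    tests = 0
    while isinstance(tree, tuple):
        attribute, left, right = tree
        tests += 1
        tree = right if container[attribute] == 1 else left
    return tree, tests


def truth_table(tree, n):
    """Boolean function computed by the tree, as a dict over all n-bit strings."""
    return {bits: classify(tree, bits)[0] for bits in product((0, 1), repeat=n)}


def sensor_nodes(tree):
    """Count observation (sensor) nodes in the tree."""
    if not isinstance(tree, tuple):
        return 0
    _, left, right = tree
    return 1 + sensor_nodes(left) + sensor_nodes(right)


def expected_tests(tree, n):
    """Average number of sensors used, assuming all 2**n strings are equally likely."""
    strings = list(product((0, 1), repeat=n))
    return sum(classify(tree, bits)[1] for bits in strings) / len(strings)
```

Under the uniform assumption, the three-sensor tree also needs fewer readings per container on average than the four-sensor tree. With a different distribution of containers the comparison could change, which is the unit cost complication noted in the Cost Functions slide.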

Cost Functions
•One Approach to False Positives/Negatives and Sensor Errors: Modeling Sensor Operation
•Threshold Model:
•Sensors have different discriminating power
•Many use counts
•See if count exceeds threshold

Cost Functions
Threshold Model:
•Sensor discriminating power K, threshold T
•Attribute present if counts exceed T
•Calculate fraction of objects in each category whose readings exceed T
•Seek threshold values that minimize all costs: inspection, false positive/negative
•Simulation approach
•Mathematical modeling

Complications
•Sensor errors: probabilistic approach
•More than two values of an attribute (present, absent, present with probability > 75%, etc.)
•Inferring the boolean function from observations (partially defined boolean functions)

Complications
Machine learning approaches are promising:
–Bayesian binary regression
–Splitting strategies
–Pruning learned decision trees

Research Team
•Endre Boros, Rutgers, Operations Research
•Elsayed Elsayed, Rutgers, Industrial and Systems Engineering
•Paul Kantor, Rutgers, School of Information and Library Studies
•Sallie Keller-McNulty, Los Alamos, Statistical Sciences Group
•Alex Kogan, Rutgers, Business School
•Paul Lioy, Rutgers/UMDNJ, Environmental and Occupational Health Sciences Institute
•David Madigan, Rutgers, Statistics
•Richard Mammone, Rutgers, Center for Advanced Information Processing
•S. Muthukrishnan, Rutgers, Computer Science
•Feng Pan, Los Alamos, Energy and Infrastructure Analysis Group
•Richard Picard, Los Alamos, Statistical Sciences Group
•Fred Roberts, Rutgers, DIMACS Center
•Kevin Saeger, Los Alamos, Homeland Security
•Phillip Stroud, Los Alamos, Systems Engineering and Integration Group

For Further Information
Fred Roberts
•Director of DIMACS
http://dimacs.rutgers.edu/
•Chair, Rutgers University Homeland Security Research Initiative
http://dimacs.rutgers.edu/RUHSRI/
•Co-chair, New Jersey Universities Homeland Security Research Consortium
http://dimacs.rutgers.edu/NJHSConsortium/
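
Supplementary sketch for the threshold model in the Cost Functions slides: the Python below computes the fraction of containers in each category whose readings exceed a threshold T, then searches a grid for the T with the lowest expected error cost. It rests on assumptions that are not in the slides: readings are taken to be normally distributed with unit variance, centered at 0 for "ok" containers and at K for "suspicious" ones, and the prior `p_suspicious` and the costs `cost_fp` and `cost_fn` are hypothetical parameters. Inspection and delay costs are left out.

```python
import math


def normal_cdf(x, mean):
    """CDF of a unit-variance normal distribution with the given mean."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0)))


def error_rates(T, K):
    """Return (false positive rate, false negative rate) for threshold T.

    Assumption: "ok" readings ~ N(0, 1), "suspicious" readings ~ N(K, 1).
    """
    false_positive = 1.0 - normal_cdf(T, 0.0)   # "ok" container reads above T
    false_negative = normal_cdf(T, K)           # "suspicious" container reads at or below T
    return false_positive, false_negative


def expected_cost(T, K, p_suspicious, cost_fp, cost_fn):
    """Expected error cost per container for threshold T."""
    fp, fn = error_rates(T, K)
    return (1.0 - p_suspicious) * fp * cost_fp + p_suspicious * fn * cost_fn


def best_threshold(K, p_suspicious, cost_fp, cost_fn, grid):
    """Grid search for the threshold with the lowest expected error cost."""
    return min(grid, key=lambda T: expected_cost(T, K, p_suspicious, cost_fp, cost_fn))


# Example grid: thresholds from -2.00 to 4.00 in steps of 0.01.
GRID = [i / 100.0 for i in range(-200, 401)]
```

With equal priors and equal error costs, the search settles midway between the two means, at T = K/2. Raising `cost_fn` relative to `cost_fp` pushes the chosen threshold lower, trading more false positives for fewer false negatives.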