Quantifying Unobserved Attributes in Expert Elicitation of Terrorist Preferences

Vicki Bier, Chen Wang
University of Wisconsin-Madison

Research Goals

• To construct a reasonable defender prior distribution over possible terrorist preferences:
  – By explicitly modeling the defender's uncertainty about unobserved attributes
  – I.e., attributes that may be important to the terrorist, but are unquantified or unobserved by the defender
• To simplify the task of quantifying threat probabilities for subject-matter experts:
  – By using ordinal rather than cardinal estimates
  – To increase the acceptance of quantitative approaches in the intelligence community

Indirect Expert Elicitation

• Allows experts to express their knowledge as rank orderings rather than numerical values:
  – Simplifies the process of bringing expert knowledge to bear
• Well suited to elicitation challenges commonly encountered in the intelligence community:
  – High uncertainty and sparse data
  – Reluctance on the part of experts to express their knowledge in probabilistic form

Mathematical Approach

• Experts provide rank orderings of selected targets or attack strategies:
  – Reflecting their knowledge of adversary preferences
• Adversary preferences are assumed to follow an additive multi-attribute utility function:
  – With an "error term" representing the effect of any attributes that have not been identified or observed by the defender
  – Assumed to be independent and identically distributed!
• Probabilistic inversion is used to estimate the values of both attribute weights and unobserved attributes:
  – To yield the best fit to the stated rank orderings
  – Taking into account expert consensus or disagreement

Attribute Values for the Case Study

X1 = property loss ($ million); X2 = fatalities; X3 = population; X4 = population density (per sq mile)

Urban Area                     X1      X2      X3           X4
New York, NY                   413     304     9,314,235    8,159
Chicago                        115     54      8,272,768    1,634
San Francisco                  57      24      1,731,183    1,705
Washington, DC-MD-VA-WV        36      29      4,923,153    756
Los Angeles-Long Beach         34      17      9,519,338    2,344
Philadelphia, PA-NJ            21      9       5,100,931    1,323
Boston, MA-NH                  18      12      3,406,829    1,685
Houston                        11      9       4,177,646    706
Newark, NJ                     7.3     4       2,032,989    1,289
Seattle-Bellevue-Everett       6.7     4       2,414,616    546
Jersey City                    4.4     2       608,975      13,044
Detroit                        4.2     1.9     4,441,551    1,140
Las Vegas, NV-AZ               4.1     1       1,563,282    40
Oakland, CA                    4       1       2,392,557    1,642
Orange County, CA              3.7     2       2,846,289    3,606
Cleveland-Lorain-Elyria        3       0.5     2,250,871    832
San Diego                      2.8     1       2,813,833    670
Miami, FL                      2.7     0.5     2,253,362    1,158
Minneapolis-St. Paul, MN-WI    2.7     0.4     2,968,806    490
Denver                         2.5     1.1     2,109,282    561

3 Hypothetical Expert Groups

• Each group has 10 experts:
  – Each of whom ranks the top 10 targets
• Group 1:
  – All find X1 (property loss) and X2 (fatalities) important, but not X4 (population density)
  – Little or no weight on unobserved attributes
• Group 2:
  – All think X4 (population density) is important
  – Opinions reflect an unobserved attribute, corresponding to presence of entertainment industry
• Group 3 (expert disagreement):
  – Five experts from Group 1, and five from Group 2

Modeling Unobserved Attributes

• Attempt to fit the attribute weights to the stated rank orderings
• Use trial and error to find the weight for the unobserved attribute that yields the lowest infeasibility
• Resulting weights:
  – Group 1: 0.02
  – Group 2: 0.09 (largest)
  – Group 3: 0.08

[Figure: relative information (y-axis) versus weight of unobserved attribute (x-axis), one curve each for Group 1, Group 2, and Group 3]
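
Illustrative sketch (not the authors' implementation): the additive utility with an error term from the Mathematical Approach slide, applied to four rows of the attribute table. The rescaling of each attribute by its column maximum, the weights, and the values of the unobserved attribute are all assumptions chosen for illustration; the function names `utilities` and `ranking` are hypothetical. It only shows how a nonzero weight on an unobserved attribute can reorder targets.

```python
import numpy as np

# Four urban areas with X1..X4 copied from the case-study table
# (property loss in $ million, fatalities, population, density per sq mile).
cities = ["New York", "Chicago", "Los Angeles", "Jersey City"]
X = np.array([
    [413.0, 304.0, 9_314_235, 8_159],
    [115.0, 54.0, 8_272_768, 1_634],
    [34.0, 17.0, 9_519_338, 2_344],
    [4.4, 2.0, 608_975, 13_044],
])

# ASSUMPTION: rescale each attribute by its column maximum so values lie in (0, 1].
X_scaled = X / X.max(axis=0)


def utilities(x_scaled, w_observed, w_unobserved, eps):
    """Additive utility: weighted observed attributes plus a weighted error term."""
    return x_scaled @ w_observed + w_unobserved * eps


def ranking(u):
    """City names sorted from most to least attractive."""
    return [cities[i] for i in np.argsort(-u)]


# HYPOTHETICAL weights; together with w_eps they sum to 1.
w_obs = np.array([0.2, 0.3, 0.1, 0.3])
w_eps = 0.1

# Case 1: the unobserved attribute is zero everywhere.
eps_zero = np.zeros(len(cities))
# Case 2 (hypothetical): Los Angeles scores high on the unobserved attribute.
eps_la = np.array([0.0, 0.0, 1.0, 0.0])

print(ranking(utilities(X_scaled, w_obs, w_eps, eps_zero)))
print(ranking(utilities(X_scaled, w_obs, w_eps, eps_la)))
```

With these made-up numbers, Los Angeles moves ahead of Chicago once it is given a high value on the unobserved attribute, which is the kind of shift the unobserved-attribute term is meant to capture.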
Results of Probabilistic Inversion

          E[X1]            E[X2]          E[X3]          E[X4]                  X5                        Relative Information
          (Property Loss)  (Fatalities)   (Population)   (Population Density)   (Unobserved Attributes)   (Infeasibility)
Group 1   0.367            0.552          0.023          0.038                  0.02                      0.272
Group 2   0.210            0.265          0.090          0.345                  0.09                      1.067
Group 3   0.325            0.366          0.113          0.117                  0.08                      0.016

• Group 1 yields high weights on X1 and X2:
  – Low weight on X4
• Group 2 yields the highest weight on X4:
  – With Group 3 intermediate between Groups 1 and 2
• As expected, Group 2 has the largest infeasibility:
  – Since the experts in Group 2 take unobserved attributes into account, even the best fit performs worse than the other two groups

Distributions of Attribute Weights

[Figure: histograms (frequency versus weight on a 0 to 1 axis) of the weights x_1, x_2, x_3, x_4 for each group, comparing the uniform prior with the result of probabilistic inversion]

• Group 1: X1 and X2 increase
• Group 2: X4 increases
• Group 3: Inconsistent judgments, higher variance

Unobserved Attributes

• Can look at the posterior distributions for the unobserved attribute:
  – To identify candidate unobserved attribute(s)

[Figure: Group 2 histograms of epsilon_NYC, epsilon_LA, epsilon_Jersey, and epsilon_LV, comparing the uniform prior with the result of probabilistic inversion]

  – Group 2: LA, Jersey City, and Las Vegas increase

• Posterior correlations for unobserved attributes:

            NYC      DC       LA      Las Vegas
NYC         1
DC          -0.201   1
LA          0.034    -0.011   1
Las Vegas   0.045    0.014    0.161   1

  – Positive correlation
between Los Angeles and Las Vegas suggests that some experts consider the presence of an entertainment industry important

Predicted Rankings With Unobserved Attributes

Red: increased; blue: decreased (compared to the case without unobserved attributes). Changes by more than 1 place are colored.

Rank   Group 1          Group 2          Group 3
1      NYC              NYC              NYC
2      Chicago          Jersey City      Chicago
3      San Francisco    Chicago          LA
4      DC               LA               DC
5      LA               Orange County    San Francisco
6      Boston           San Francisco    Jersey City
7      Philadelphia     DC               Boston
8      Houston          Boston           Philadelphia
9      Jersey City      Philadelphia     Houston
10     Newark           Detroit          Orange County
11     Seattle          Houston          Newark
12     Orange County    Oakland          Oakland
13     Detroit          Newark           Detroit
14     Oakland          Miami            Miami
15     San Diego        Las Vegas        Seattle
16     Miami            Seattle          Minneapolis
17     Denver           Cleveland        San Diego
18     Minneapolis      Minneapolis      Las Vegas
19     Cleveland        San Diego        Cleveland
20     Las Vegas        Denver           Denver

Assessment of Results

• Predicted rankings are more consistent with expert judgments when unobserved attributes are included, especially for Group 2:
  – Las Vegas gets high rankings due to the unobserved attribute (presence of entertainment industry)
  – The model without unobserved attributes does not have the flexibility to adequately reflect expert judgments
• Can also be used as a basis for inference about what the unobserved attributes might be
• For example, if LA and Las Vegas are rated higher than their known attribute values would suggest:
  – That might indicate the need to include presence of a large entertainment industry as a terrorist attribute

Assessment of Results

• Results may be better than direct weight elicitation
• For example, some experts may put high weight on population density:
  – Without realizing this implies a high ranking for Jersey City
• Can deal with conflicting and/or inconsistent expert opinions:
  – By (possibly multi-modal) distributions of attribute weights with high variance

Pre-Posterior Analysis

• How do the models perform with Bayesian updating:
  – Especially after an unexpected
attack?
• Problem:
  – Some targets have zero probability of being attacked in the model
  – Model would break in the event of an attack on such a target
  – Cannot condition on a set of measure zero!
• Model without unobserved attributes is especially poor in this respect
• May need to consider non-uniform (e.g., U-shaped) prior distributions for the unobserved attributes

Bayesian Updating

• Consider a target with a positive probability of being attacked:
  – Assume an (unexpected) attack on Jersey City
• Probability that the next attack is also on Jersey City becomes quite high (maybe unrealistically high)
• What happens to the attribute weights?

An Attack on Jersey City

[Figure: histograms (frequency versus weight on a 0 to 1 axis) of the weights x_1, x_2, x_3, x_4 for Groups 1, 2, and 3, with prior mean and posterior mean marked]

• X1 and X2 decrease; X4 increases

Posterior of Unobserved Attribute

[Figure: histograms of epsilon_NYC, epsilon_Jersey, and epsilon_Las Vegas for Groups 1, 2, and 3, with prior mean and posterior mean marked]

• Jersey City has higher values on the unobserved attribute

Predicted Rankings after an Attack on Jersey City

Red: increased; blue: decreased. Changes by more than 2 places are colored.

• Jersey City ranks higher, but not the highest for every group!
  – This seems reasonable
  – Groups 1 and 3 still consider NYC more attractive

Rank   Group 1          Group 2          Group 3
1      NYC              Jersey City      NYC
2      Jersey City      NYC              Jersey City
3      Chicago          Orange County    Orange County
4      LA               LA               LA
5      San Francisco    Chicago          Chicago
6      Orange County    Boston           Boston
7      DC               San Francisco    San Francisco
8      Boston           Philadelphia     Philadelphia
9      Philadelphia     Oakland          Oakland
10     Oakland          Detroit          Detroit
11     Detroit          DC               DC
12     Newark           Newark           Newark
13     Houston          Miami            Miami
14     Miami            Houston          Houston
15     Cleveland        Cleveland        Cleveland
16     San Diego        San Diego        San Diego
17     Seattle          Seattle          Seattle
18     Minneapolis      Minneapolis      Minneapolis
19     Denver           Denver           Denver
20     Las Vegas        Las Vegas        Las Vegas

An Attack on New York City

[Figure: histograms (frequency versus weight on a 0 to 1 axis) of the weights x_1, x_2, x_3, x_4 for Groups 1, 2, and 3, with prior mean and posterior mean marked]

• No significant changes

Future Directions

• An alternative approach for fitting expert opinions:
  – Bayesian density estimation
• Sensitivity analysis on the performance of the two approaches:
  – Computational behavior (convergence properties, run times)
  – Reasonableness of predicted rankings
  – Performance with Bayesian updating after expected and unexpected attacks
• Obtain stakeholder feedback on applicability of methodology and realism of results
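
Supplementary sketch for the Bayesian Updating slides (an illustration under stated assumptions, not the authors' code). It assumes the attacker strikes the target with the highest utility, so the defender's attack probability for a target is the share of sampled (weights, unobserved attributes) for which that target comes out on top, and updating on an observed attack means keeping only the samples consistent with it. The uniform priors, the rescaling by column maximum, the four-city subset, and all variable names are assumptions. If no sample puts the attacked target on top, there is nothing to condition on, which is the measure-zero problem noted under Pre-Posterior Analysis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Four urban areas with X1..X4 copied from the case-study table.
cities = ["New York", "Chicago", "Los Angeles", "Jersey City"]
X = np.array([
    [413.0, 304.0, 9_314_235, 8_159],
    [115.0, 54.0, 8_272_768, 1_634],
    [34.0, 17.0, 9_519_338, 2_344],
    [4.4, 2.0, 608_975, 13_044],
])
X_scaled = X / X.max(axis=0)  # ASSUMPTION: rescale by column maximum

n = 50_000
# ASSUMPTION: uniform prior on the simplex for (w1, w2, w3, w4, w_unobserved).
W = rng.dirichlet(np.ones(5), size=n)
# ASSUMPTION: iid uniform [0, 1] unobserved attribute for each city.
eps = rng.uniform(size=(n, len(cities)))

# Utility of every city under every sample; shape (n, number of cities).
U = W[:, :4] @ X_scaled.T + W[:, 4:5] * eps

# ASSUMPTION: the attacker picks the highest-utility target.
target = U.argmax(axis=1)
prior_prob = np.bincount(target, minlength=len(cities)) / n

# Condition on an observed attack on Jersey City by keeping consistent samples.
jc = cities.index("Jersey City")
keep = target == jc
if not keep.any():
    raise ValueError("attacked target has zero prior probability in the sample")

post_W = W[keep]
post_eps = eps[keep]

print("prior attack probabilities:", dict(zip(cities, prior_prob.round(3))))
print("prior mean weight on X4:", W[:, 3].mean().round(3))
print("posterior mean weight on X4:", post_W[:, 3].mean().round(3))
print("posterior mean epsilon for Jersey City:", post_eps[:, jc].mean().round(3))
```

In this toy setup the posterior mean weight on X4 and the posterior mean of Jersey City's unobserved attribute both rise relative to the prior, matching the direction of the changes reported on the slides; the magnitudes are not comparable to the slides' results.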