Sensitivity Analysis of Enumerated Trees of Increasing Boolean Expressions

Saket Anand, David Madigan, Richard Mammone, Fred Roberts

Enumeration and Selection of Optimum Decision Tree

A set of decision trees is constructed for each complete and monotonic Boolean function whose inputs represent the tests performed by each sensor:

  Y = f(A, B, C), where f is complete and monotonic

The cost of each tree is evaluated and the optimum tree is selected.

  A  B  C  |  Y
  0  0  0  |  0
  0  0  1  |  0
  0  1  0  |  0
  0  1  1  |  0
  1  0  0  |  0
  1  0  1  |  1
  1  1  0  |  1
  1  1  1  |  1

[Figure: example decision trees over sensors A, B and C with terminal nodes 0 and 1]

The decision trees are constructed using 4 sensors.
For four sensors, there are 114 monotonic and complete Boolean expressions. These can be implemented using 11,808 distinct trees.
The trees are evaluated and ranked using the cost function [1].
The tree with the lowest cost is selected as the optimum decision tree.

[1] Stroud, P. D. and Saeger, K. J., "Enumeration of Increasing Boolean Expressions and Alternative Digraph Implementations for Diagnostic Applications", Proceedings Vol. IV, Computer, Communication and Control Technologies.

Cost Function Used for Evaluating the Decision Trees

  CTot = CFalsePositive * PFalsePositive + CFalseNegative * PFalseNegative + Cfixed

where
  CFalsePositive is the cost of a false positive (Type I error),
  CFalseNegative is the cost of a false negative (Type II error),
  PFalsePositive is the probability of a false positive occurring,
  PFalseNegative is the probability of a false negative occurring,
  Cfixed is the fixed cost of utilization of the tree.

The error probability of the entire tree is computed from the error probabilities of the individual sensors.

Probability of Error for Individual Sensors

For the ith sensor, the Type I error P(Yi=1|X=0) and the Type II error P(Yi=0|X=1) are modeled using Gaussian distributions.
State of nature X=0 represents the absence of a bomb.
State of nature X=1 represents the presence of a bomb.
Yi represents the outcome of sensor i.
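The count of 114 complete and monotonic Boolean expressions quoted in the enumeration step can be checked by brute force. The sketch below is not the Stroud-Saeger enumeration procedure; it simply tests every truth table, and the function name and the ordering of truth-table rows (000, 001, ..., 111) are assumptions made for illustration. It reports 9 functions for three inputs and 114 for four.

```python
from itertools import product

def complete_monotone_functions(n):
    """Truth tables (tuples of 0/1, length 2**n) of Boolean functions that are
    monotone (raising an input never lowers the output) and complete
    (the output depends on every input)."""
    inputs = list(product((0, 1), repeat=n))
    index = {x: i for i, x in enumerate(inputs)}
    # (lo, hi, j): table positions of two inputs that differ only in bit j,
    # where the "lo" input has bit j = 0 and the "hi" input has bit j = 1.
    pairs = []
    for x in inputs:
        for j in range(n):
            if x[j] == 0:
                y = x[:j] + (1,) + x[j + 1:]
                pairs.append((index[x], index[y], j))
    found = []
    for table in product((0, 1), repeat=len(inputs)):
        if any(table[lo] > table[hi] for lo, hi, _ in pairs):
            continue  # not monotone
        used = {j for lo, hi, j in pairs if table[lo] != table[hi]}
        if len(used) == n:  # every input influences the output
            found.append(table)
    return found

for n in (3, 4):
    print(n, "inputs:", len(complete_monotone_functions(n)), "functions")
```

The exhaustive check visits 2^(2^n) truth tables, so it is only practical for the small sensor counts considered here.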
Each sensor is characterized by:
  Ki, the discrimination coefficient
  Ti, the decision threshold
  Σi, the variance of the distributions

[Figure: Characteristics of a typical sensor, showing the distributions P(Yi|X=0) and P(Yi|X=1) annotated with Ki and Ti]

Receiver Operating Characteristic (ROC) Curve

The ROC curve is the plot of the probability of correct detection (PD) against the probability of false positive (PF).
The ROC curve is used to select an operating point, which sets the trade-off between PD and PF.
Each sensor has a ROC curve, and the combination of the sensors into a decision tree has a composite ROC curve.
Different operating points on the ROC curve are obtained by varying the sensor threshold or, for a decision tree, the combination of thresholds.
The Equal Error Rate (EER) is the operating point on the ROC curve where PF = 1 - PD.

[Figure: ROC curve (PD vs. PF, both from 0 to 1) with the operating point and the EER marked, next to the sensor distributions P(Yi|X=0) and P(Yi|X=1) annotated with Ki and Ti]

Stroud-Saeger Experiments

Stroud and Saeger ranked all trees formed from four given sensors A, B, C and D in order of increasing tree cost. The cost function used was the one shown in the earlier slides.
Values used in their experiment:

  Sensor   Ci     Ki     Σi
  A        0.25   4.37   1
  B        0.25   1.53   1
  C        10     2.9    1
  D        30     4.6    1

where Ci is the individual cost of utilization of sensor i, Ki is the sensor discrimination power and Σi is the relative spread factor for sensor i. Values of the other variables are not known.

Cost Sensitivity to Global Parameters

Values used in the experiment:

  Sensor   Ci     P(Yi=1|X=1)   P(Yi=1|X=0)
  A        0.25   0.9856        0.0144
  B        1      0.7779        0.2221
  C        10     0.9265        0.0735
  D        30     0.9893        0.0107

where Ci is the individual cost of utilization of sensor i. The probabilities have been computed for a threshold corresponding to the equal error rate.
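The equal-error-rate probabilities can be reproduced from the Stroud-Saeger discrimination coefficients. The sketch below assumes the sensor response is standard normal when X=0 and Gaussian with mean Ki and spread Σi when X=1; the function names are illustrative. Under that assumption PF = 1 - PD holds at Ti = Ki / (1 + Σi), and the resulting values agree with the listed probabilities to about four decimal places.

```python
import math

def false_positive(t):
    """P(Y=1 | X=0) at threshold t, taking the X=0 response as standard normal."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))

def detection(t, k, sigma=1.0):
    """P(Y=1 | X=1) at threshold t, taking the X=1 response as N(k, sigma^2)."""
    return 0.5 * math.erfc((t - k) / (sigma * math.sqrt(2.0)))

def roc_points(k, thresholds, sigma=1.0):
    """(PF, PD) pairs traced out by sweeping the decision threshold."""
    return [(false_positive(t), detection(t, k, sigma)) for t in thresholds]

def eer_threshold(k, sigma=1.0):
    """Threshold where PF = 1 - PD: t = (k - t) / sigma, hence t = k / (1 + sigma)."""
    return k / (1.0 + sigma)

# Discrimination coefficients from the Stroud-Saeger experiment (all with spread 1).
K = {"A": 4.37, "B": 1.53, "C": 2.9, "D": 4.6}

for name, k in K.items():
    t = eer_threshold(k)
    print(name, round(false_positive(t), 4), round(detection(t, k), 4))
```

Passing a list of thresholds to `roc_points` gives one sensor's ROC curve; the EER point is the single threshold on that curve where the two error rates coincide.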
Ranges of the global parameters:
  CFalseNegative is varied between 25 million and 500 billion dollars (low and high estimates of the direct and indirect costs incurred due to a false negative).
  CFalsePositive is varied between 180 and 720 dollars (cost incurred due to a false positive: 4 men * (3 - 6 hrs) * (15 - 30 $/hr)).
  P(X=1) is varied between 3/10^9 and 1/100,000.

Structure of the Trees Ranked First with 3 Sensors (A, C and D)

[Figure: tree diagrams for the following trees]
  Tree number 49, Boolean expression 01010111
  Tree number 55, Boolean expression 01111111
  Tree number 37, Boolean expression 00011111

Frequency of optimal trees with 3 sensors (A, C and D) when one parameter was varied

  Variable parameter   Constant parameters
  CFalseNegative       P(X=1) = 1.281 x 10^-6, CFalsePositive = 492.61
  CFalsePositive       P(X=1) = 0.8373 x 10^-5, CFalseNegative = 4.2681 x 10^11
  P(X=1)               CFalseNegative = 4.4747 x 10^11, CFalsePositive = 351.9526

  Tree number   Frequency (out of 10,000)   Equivalent Boolean expression
  37            568                         00011111
  55            9432                        01111111
  55            9946                        01111111
  37            54                          00011111
  55            9946                        01111111

Fixed parameter values: randomly selected.

Variation of CTot vs. CFalseNegative
P(X=1) and CFalsePositive were kept constant at the specified values, and CTot was computed for 10,000 randomly selected values of CFalseNegative in the specified range.
Fixed parameter values: randomly selected.

Variation of CTot vs. CFalsePositive
P(X=1) and CFalseNegative were kept constant at the specified values, and CTot was computed for 10,000 randomly selected values of CFalsePositive in the specified range.
Fixed parameter values: randomly selected.

Variation of CTot vs. P(X=1)
CFalsePositive and CFalseNegative were kept constant at the specified values, and CTot was computed for 10,000 randomly selected values of P(X=1) in the specified range.
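The one-parameter experiment can be sketched as a small Monte Carlo loop. Everything structural here is an assumption for illustration: the three candidate trees are hypothetical (they are not claimed to be trees 37, 49 or 55), sensor outcomes are treated as conditionally independent given X, Cfixed is taken as the expected cost of the sensors actually used, and CFalseNegative is sampled uniformly. The sensor costs and probabilities are the equal-error-rate values for A, C and D, and the constants are the randomly selected P(X=1) and CFalsePositive from the table above.

```python
import random

# (utilization cost, P(Y=1|X=1), P(Y=1|X=0)) at the equal error rate.
SENSORS = {"a": (0.25, 0.9856, 0.0144),
           "c": (10.0, 0.9265, 0.0735),
           "d": (30.0, 0.9893, 0.0107)}

# A tree is a leaf (0 or 1) or a tuple (sensor, subtree if Y=0, subtree if Y=1).
CANDIDATES = {
    "a OR c OR d": ("a", ("c", ("d", 0, 1), 1), 1),
    "a OR (c AND d)": ("a", ("c", 0, ("d", 0, 1)), 1),
    "a AND c AND d": ("a", 0, ("c", 0, ("d", 0, 1))),
}

def evaluate(tree, state):
    """Return (P(tree outputs 1 | X=state), expected sensor cost | X=state)."""
    if isinstance(tree, int):
        return float(tree), 0.0
    name, low, high = tree
    cost, pd, pf = SENSORS[name]
    p_one = pd if state == 1 else pf
    out_lo, cost_lo = evaluate(low, state)
    out_hi, cost_hi = evaluate(high, state)
    return ((1 - p_one) * out_lo + p_one * out_hi,
            cost + (1 - p_one) * cost_lo + p_one * cost_hi)

def total_cost(tree, p1, c_fp, c_fn):
    """CTot = CFP * P(X=0) * P(Y=1|X=0) + CFN * P(X=1) * P(Y=0|X=1) + Cfixed."""
    pf_tree, cost0 = evaluate(tree, 0)
    pd_tree, cost1 = evaluate(tree, 1)
    c_fixed = (1 - p1) * cost0 + p1 * cost1  # assumed: expected sensor usage cost
    return c_fp * (1 - p1) * pf_tree + c_fn * p1 * (1 - pd_tree) + c_fixed

def optimal_tree_frequencies(runs=10_000, p1=1.281e-6, c_fp=492.61, seed=0):
    """Count how often each candidate has the lowest CTot as CFalseNegative varies."""
    rng = random.Random(seed)
    counts = {name: 0 for name in CANDIDATES}
    for _ in range(runs):
        c_fn = rng.uniform(25e6, 500e9)  # assumed uniform over the stated range
        best = min(CANDIDATES, key=lambda n: total_cost(CANDIDATES[n], p1, c_fp, c_fn))
        counts[best] += 1
    return counts

print(optimal_tree_frequencies())
```

Varying CFalsePositive or P(X=1) instead only changes which argument is drawn inside the loop.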
The plot of CTot vs. P(X=1) above uses randomly selected fixed parameter values.

Frequency of optimal trees with 3 sensors (A, C and D) when one parameter was varied

  Variable parameter   Constant parameters                                  Tree number   Frequency (out of 10,000)   Equivalent Boolean expression
  CFalseNegative       P(X=1) = 3 x 10^-8, CFalsePositive = 600             37            10000                       00011111
  CFalsePositive       P(X=1) = 3 x 10^-8, CFalseNegative = 5 x 10^10       37            10000                       00011111
  P(X=1)               CFalseNegative = 5 x 10^10, CFalsePositive = 600     49            108                         01010111
                                                                            37            694                         00011111
                                                                            55            9198                        01111111

Fixed parameter values: Stroud and Saeger values.

Variation of CTot vs. CFalseNegative
P(X=1) and CFalsePositive were kept constant at the specified values, and CTot was computed for 10,000 randomly selected values of CFalseNegative in the specified range.
Fixed parameter values: Stroud and Saeger values.

Variation of CTot vs. CFalsePositive
P(X=1) and CFalseNegative were kept constant at the specified values, and CTot was computed for 10,000 randomly selected values of CFalsePositive in the specified range.
Fixed parameter values: Stroud and Saeger values.

Variation of CTot vs. P(X=1)
CFalsePositive and CFalseNegative were kept constant at the specified values, and CTot was computed for 10,000 randomly selected values of P(X=1) in the specified range.
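As a worked instance of the cost function at the Stroud and Saeger settings (P(X=1) = 3 x 10^-8, CFalseNegative = 5 x 10^10, CFalsePositive = 600), the weights on the two tree-level error probabilities can be computed directly. This reads PFalsePositive as the joint probability P(X=0) P(Y=1|X=0) and PFalseNegative as P(X=1) P(Y=0|X=1), which is the expanded form shown with the two-parameter plots.

```latex
\begin{aligned}
C_{FalseNegative}\,P(X=1) &= 5\times10^{10}\cdot 3\times10^{-8} = 1500,\\
C_{FalsePositive}\,P(X=0) &= 600\,(1 - 3\times10^{-8}) \approx 600,\\
C_{Tot} &\approx 600\,P(Y=1\mid X=0) + 1500\,P(Y=0\mid X=1) + C_{fixed}.
\end{aligned}
```

At these settings a missed detection is therefore weighted 2.5 times as heavily as a false alarm, and both terms are traded off against the fixed cost of the sensors in the tree.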
The plot of CTot vs. P(X=1) above uses the Stroud and Saeger fixed parameter values.

Variation of CTot with Respect to Two Parameters

  CTot = CFalsePositive * P(X=0) * P(Y=1|X=0) + CFalseNegative * P(X=1) * P(Y=0|X=1) + Cfixed

[Figures: variation of CTot with respect to
  CFalseNegative and CFalsePositive,
  CFalseNegative and P(X=1),
  CFalsePositive and P(X=1),
each shown once for randomly selected fixed parameter values and once for the Stroud and Saeger fixed parameter values]

Tree Structure and Corresponding Boolean Expressions

[Figure: tree diagrams for the following trees]

  Tree number   Boolean expression
  11785         0111111111111111
  11605         0101011111111111
  9133          0001010111111111
  8965          0001010101111111
  6797          0001000101111111
  2473          0000000101111111
  11305         0101010101111111

Variation of CTot vs. CFalseNegative, vs. CFalsePositive and vs. P(X=1)

In each case the other two parameters were kept constant at the specified values, and CTot was computed for 10,000 randomly selected values of the varied parameter in the specified range. Each plot is shown once for randomly selected fixed parameter values and once for the Stroud and Saeger fixed parameter values.

Frequency of optimal trees with 4 sensors when two parameters were varied. The fixed parameters were randomly selected.
  Constant parameter             Variable parameters              Tree number   Frequency (out of 10,000)   Equivalent Boolean expression
  CFalsePositive = 453.6849      CFalseNegative, P(X=1)           505           1                           0000000001111111
                                                                  6797          18                          0001000101111111
                                                                  8965          50                          0001010101111111
                                                                  9001          7                           0001010101111111
                                                                  9017          6                           0001010101111111
                                                                  9133          235                         0001010111111111
                                                                  11605         8621                        0101011111111111
                                                                  11785         1062                        0111111111111111
  CFalseNegative = 4.7485 x 10^10   P(X=1), CFalsePositive        2617          1                           0000000111111111
                                                                  6797          16                          0001000101111111
                                                                  8965          121                         0001010101111111
                                                                  9001          7                           0001010101111111
                                                                  9017          13                          0001010101111111
                                                                  9133          392                         0001010111111111
                                                                  11305         99                          0101010101111111
                                                                  11605         9351                        0101011111111111
  P(X=1) = 0.6344 x 10^-5        CFalseNegative, CFalsePositive   6797          2                           0001000101111111
                                                                  8965          13                          0001010101111111
                                                                  9133          65                          0001010111111111
                                                                  11305         13                          0101010101111111
                                                                  11605         7928                        0101011111111111
                                                                  11785         1979                        0111111111111111

Variation of CTot with Respect to Two Parameters (randomly selected fixed parameter values)

  CTot = CFalsePositive * P(X=0) * P(Y=1|X=0) + CFalseNegative * P(X=1) * P(Y=0|X=1) + Cfixed

[Figures: variation of CTot with respect to
  CFalseNegative and CFalsePositive,
  CFalseNegative and P(X=1),
  CFalsePositive and P(X=1)]

Frequency of optimal trees with 4 sensors when two parameters were varied. The fixed parameters were set to the Stroud and Saeger values.
  Constant parameter             Variable parameters              Tree number   Frequency (out of 10,000)   Equivalent Boolean expression
  CFalsePositive = 600           CFalseNegative, P(X=1)           505           1                           0000000001111111
                                                                  2473          2                           0000000101111111
                                                                  2509          1                           0000000101111111
                                                                  6797          18                          0001000101111111
                                                                  8965          138                         0001010101111111
                                                                  9001          19                          0001010101111111
                                                                  9017          7                           0001010101111111
                                                                  9133          184                         0001010111111111
                                                                  11305         65                          0101010101111111
                                                                  11605         9232                        0101011111111111
                                                                  11785         333                         0111111111111111
  CFalseNegative = 5 x 10^10     P(X=1), CFalsePositive           6797          14                          0001000101111111
                                                                  8965          117                         0001010101111111
                                                                  9001          9                           0001010101111111
                                                                  9017          9                           0001010101111111
                                                                  9133          374                         0001010111111111
                                                                  11305         96                          0101010101111111
                                                                  11605         9381                        0101011111111111
  P(X=1) = 3 x 10^-8             CFalseNegative, CFalsePositive   505           11                          0000000001111111
                                                                  775           5                           0000000100001111
                                                                  2473          42                          0000000101111111
                                                                  2617          40                          0000000111111111
                                                                  6797          558                         0001000101111111
                                                                  8965          3833                        0001010101111111
                                                                  9133          5406                        0001010111111111
                                                                  11605         105                         0101011111111111

Variation of CTot with Respect to Two Parameters (Stroud and Saeger fixed parameter values)

  CTot = CFalsePositive * P(X=0) * P(Y=1|X=0) + CFalseNegative * P(X=1) * P(Y=0|X=1) + Cfixed

[Figures: variation of CTot with respect to
  CFalseNegative and CFalsePositive,
  CFalseNegative and P(X=1),
  CFalsePositive and P(X=1)]

Sensitivity to Sensor Performance

The following experiments were carried out using sensors A, B, C and D, as described below, by varying the individual sensor thresholds TA, TB and TC from -4.0 to +4.0 in steps of 0.4.
This threshold range was chosen because it gives a ROC curve for each individual sensor that covers the complete range of P(Yi=1|X=0) and P(Yi=1|X=1).

  Sensor   Ci     Ki     Σi
  A        0.25   4.37   1
  B        0.25   1.53   1
  C        15     2.9    1
  D        30     4.6    1

where Ci is the individual cost of utilization of sensor i, Ki is the discrimination power of the sensor and Σi is the spread factor for the sensor.

The probability of false positive for the ith sensor is computed as:
  P(Yi=1|X=0) = 0.5 erfc[Ti / √2]
The probability of detection for the ith sensor is computed as:
  P(Yi=1|X=1) = 0.5 erfc[(Ti - Ki) / (Σi √2)]

Frequency of optimal trees with 3 sensors when the thresholds were varied. The fixed parameters (CFalsePositive, CFalseNegative, P(X=1)) were selected randomly. Fifteen trees attained rank one, of which tree number 37 was the most frequent.

Constants: CFalseNegative = 5.0125 x 10^9, P(X=1) = 5.05 x 10^-6 and CFalsePositive = 450

  Tree number   Frequency   Boolean expression
  27            114         00010111
  29            146         00010111
  2             183         00000001
  49            264         01010111
  51            322         01010111
  25            957         00010111
  23            1475        00010101
  15            2437        00010011
  38            4572        00011111
  19            5256        00010101
  1             5828        00000001
  45            5873        00110111
  55            10587       01111111
  7             13392       00000111
  37            17515       00011111

[Figure: Performance (ROC) of Best Decision Tree for tree number 37]

Frequency of optimal trees with 4 sensors when the thresholds were varied. The fixed parameters (CFalsePositive, CFalseNegative, P(X=1)) were selected randomly. 244 trees attained rank one, of which tree number 445 was the most frequent. Only the 15 most frequently occurring optimal trees out of the 241 are tabulated below.
Constants: CFalseNegative = 4.8668 x 10^11, P(X=1) = 7.5361 x 10^-6 and CFalsePositive = 499.75

  Tree number   Frequency   Boolean expression
  445           13012       0000000001010111
  145           11143       0000000000010101
  11605         10958       0101011111111111
  505           10545       0000000001111111
  2617          10139       0000000111111111
  9133          9280        0001010111111111
  5761          5942        0000011111111111
  11785         5910        0111111111111111
  325           5574        0000000000011111
  506           5249        0000000001111111
  11791         5196        0111111111111111
  8003          4539        0001001111111111
  10783         4496        0001111111111111
  386           4018        0000000000110111
  87            3402        0000000000010011

[Figure: Performance (ROC) of Best Decision Tree for tree number 445]

Cost Sensitivity to Global Parameters

The sensor costs, the equal-error-rate probabilities and the ranges of CFalseNegative, CFalsePositive and P(X=1) are the same as those listed under this heading earlier.

Cost-Based BDT

For each node i:
    Calculate Cterm at that node, based upon whether Pi x Cfn is greater or less than Ni x Cfp
    For each sensor s:
        Vary the threshold from -4 to 4 in steps of 0.1
        Calculate the least expensive sensor cost Cs for node i
    If Cterm > Cs, then introduce sensor s at node i; else terminate the branch
Total runs = 10000
Count the frequency of optimum trees

[Figure: Structure of most frequent trees with 4 sensors]

[Figure: Performance of Cost Based BDT]
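The cost-based BDT steps above can be sketched as a greedy, one-step-lookahead tree builder. This is one possible reading, not the authors' implementation. It assumes that Pi and Ni are the joint probability masses of X=1 and X=0 reaching node i, that Cterm is the cheaper of the two terminal labels, that Cs is the cost of using sensor s and then terminating both branches, that each sensor is used at most once per path, and that the global parameters are drawn uniformly from their stated ranges. The sensor values are those of the sensor-performance experiments, and the default number of runs is reduced to keep the example quick.

```python
import math
import random
from collections import Counter

# (utilization cost, K, Sigma) per sensor.
SENSORS = {"a": (0.25, 4.37, 1.0), "b": (0.25, 1.53, 1.0),
           "c": (15.0, 2.9, 1.0), "d": (30.0, 4.6, 1.0)}
THRESHOLDS = [-4.0 + 0.1 * i for i in range(81)]  # -4 to 4 in steps of 0.1

def rates(t, k, sigma):
    """(P(Y=1|X=0), P(Y=1|X=1)) for a sensor at threshold t."""
    pf = 0.5 * math.erfc(t / math.sqrt(2.0))
    pd = 0.5 * math.erfc((t - k) / (sigma * math.sqrt(2.0)))
    return pf, pd

def terminal(p, n, c_fn, c_fp):
    """(cost, label) of stopping at a node with mass p for X=1 and n for X=0."""
    return (n * c_fp, 1) if p * c_fn > n * c_fp else (p * c_fn, 0)

def grow(p, n, unused, c_fn, c_fp):
    """Return (tree, cost); a tree is a leaf label or (sensor, low, high)."""
    c_term, label = terminal(p, n, c_fn, c_fp)
    best = None
    for s in sorted(unused):
        cost, k, sigma = SENSORS[s]
        for t in THRESHOLDS:
            pf, pd = rates(t, k, sigma)
            lo = (p * (1 - pd), n * (1 - pf))  # mass sent to the Y=0 branch
            hi = (p * pd, n * pf)              # mass sent to the Y=1 branch
            c_s = ((p + n) * cost + terminal(*lo, c_fn, c_fp)[0]
                   + terminal(*hi, c_fn, c_fp)[0])
            if best is None or c_s < best[0]:
                best = (c_s, s, lo, hi)
    if best is None or c_term <= best[0]:
        return label, c_term  # terminate the branch
    _, s, lo, hi = best
    rest = unused - {s}
    low, c_low = grow(*lo, rest, c_fn, c_fp)
    high, c_high = grow(*hi, rest, c_fn, c_fp)
    return (s, low, high), (p + n) * SENSORS[s][0] + c_low + c_high

def tree_frequencies(runs=100, seed=0):
    """Count how often each tree structure is produced over random parameters."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(runs):
        c_fn = rng.uniform(25e6, 500e9)
        c_fp = rng.uniform(180.0, 720.0)
        p1 = rng.uniform(3e-9, 1e-5)
        tree, _ = grow(p1, 1.0 - p1, frozenset(SENSORS), c_fn, c_fp)
        counts[tree] += 1
    return counts

for tree, count in tree_frequencies().most_common(3):
    print(count, tree)
```

Trees are counted by structure only; the thresholds chosen at each node are not recorded, so two runs that pick the same sensors in the same arrangement are treated as the same tree.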