Sensitivity Analysis of Enumerated Trees of Increasing Boolean Expressions

Saket Anand, David Madigan,
Richard Mammone, Fred Roberts
Enumeration and Selection of
Optimum Decision Tree
 A set of decision trees is
constructed for each complete
and monotonic boolean function
where inputs represent tests
performed by each sensor
Y = f(A, B, C) where f is
complete and monotonic
 The cost of each tree is
evaluated and the optimum
tree selected.
A  B  C  Y
0  0  0  0
0  0  1  0
0  1  0  0
0  1  1  0
1  0  0  0
1  0  1  1
1  1  0  1
1  1  1  1
[Figure: decision trees implementing Y = f(A, B, C), with sensor nodes A, B and C and leaves 0 and 1.]
Enumeration and Selection of
Optimum Decision Tree
 The decision trees are constructed using 4 sensors.
 For four sensors, there are 114 monotonic and complete boolean expressions. These can be implemented using 11808 distinct trees.
 The trees are evaluated and ranked using the cost function [1].
 The tree with the lowest cost is selected as the optimum decision tree.
[1] Stroud, P. D. and Saeger, K. J., "Enumeration of Increasing Boolean Expressions and Alternative Digraph Implementations for Diagnostic Applications", Proceedings Vol. IV, Computer, Communication and Control Technologies.
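The count of complete and monotonic boolean functions can be checked by brute force. The sketch below is not the enumeration procedure of Stroud and Saeger; it simply tests every truth table, and it assumes that "complete" means the function depends on every one of its inputs. The function names are illustrative.

```python
from itertools import product


def is_monotonic(table, n):
    """table[i] is f at input i, where bit j of i is the value of input j.

    f is monotonic (increasing) if switching any input from 0 to 1
    never switches the output from 1 to 0.
    """
    for i in range(1 << n):
        for j in range(n):
            if not (i >> j) & 1 and table[i] > table[i | (1 << j)]:
                return False
    return True


def is_complete(table, n):
    """Assumed meaning of 'complete': f depends on every one of its n inputs."""
    for j in range(n):
        if all(table[i] == table[i ^ (1 << j)] for i in range(1 << n)):
            return False  # flipping input j never changes the output
    return True


def count_complete_monotonic(n):
    """Count truth tables on n inputs that are both monotonic and complete."""
    return sum(
        1
        for table in product((0, 1), repeat=1 << n)
        if is_monotonic(table, n) and is_complete(table, n)
    )


if __name__ == "__main__":
    for n in (3, 4):
        print(n, count_complete_monotonic(n))
```

Under that assumption the count is 9 for three inputs and 114 for four inputs, the latter matching the figure quoted in the slide.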
Cost Function used for evaluating the
decision trees.
CTot = CFalsePositive *PFalsePositive + CFalseNegative *PFalseNegative + Cfixed
where,
CFalsePositive is the cost of false positive (Type I error)
CFalseNegative is the cost of false negative (Type II error)
PFalsePositive is the probability of a false positive occurring
PFalseNegative is the probability of a false negative occurring
Cfixed is the fixed cost of utilization of the tree.
The Error Probability of the entire tree is computed from the error
probabilities of the individual sensors.
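As a minimal sketch of how the tree error probability and CTot could be computed, the code below represents a tree as nested tuples and uses the expanded form of the cost that appears later in the deck, CTot = CFalsePositive * P(X=0) * P(Y=1|X=0) + CFalseNegative * P(X=1) * P(Y=0|X=1) + Cfixed. The tuple representation, the function names and the assumption that sensor outcomes are independent given X are illustrative and not taken from the slides.

```python
def prob_output_one(tree, p_one):
    """Probability that the tree outputs 1 for one state of nature.

    tree  : a leaf (0 or 1) or a tuple (sensor, subtree_if_0, subtree_if_1)
    p_one : dict mapping sensor name -> P(Y_i = 1 | X) for that state
    Assumption: sensor outcomes are independent given X.
    """
    if isinstance(tree, int):
        return float(tree)
    sensor, if_zero, if_one = tree
    p = p_one[sensor]
    return ((1.0 - p) * prob_output_one(if_zero, p_one)
            + p * prob_output_one(if_one, p_one))


def total_cost(tree, p_fa, p_det, prior, c_fp, c_fn, c_fixed):
    """CTot for a tree.

    p_fa    : dict sensor -> P(Y_i = 1 | X = 0)
    p_det   : dict sensor -> P(Y_i = 1 | X = 1)
    prior   : P(X = 1)
    c_fixed : fixed cost of using the tree (treated here as a given constant)
    """
    p_false_positive = (1.0 - prior) * prob_output_one(tree, p_fa)
    p_false_negative = prior * (1.0 - prob_output_one(tree, p_det))
    return c_fp * p_false_positive + c_fn * p_false_negative + c_fixed


if __name__ == "__main__":
    # Hypothetical tree: output 1 only if sensor A and then sensor C both fire.
    tree = ("A", 0, ("C", 0, 1))
    p_fa = {"A": 0.0144, "C": 0.0735}
    p_det = {"A": 0.9856, "C": 0.9265}
    print(total_cost(tree, p_fa, p_det, prior=3e-8,
                     c_fp=600.0, c_fn=5e10, c_fixed=0.25))
```

Treating Cfixed as a constant per tree is a simplification; a fuller model would weight each sensor's cost by the probability that the sensor is actually reached.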
Probability of Error for Individual
Sensors
 For the ith sensor, the Type I error P(Yi=1|X=0) and the Type II error P(Yi=0|X=1) are modeled using Gaussian distributions.
 State of nature X=0 represents absence of a bomb.
 State of nature X=1 represents presence of a bomb.
 Yi represents the outcome of sensor i.
 Each sensor is characterized by:
   Ki, discrimination coefficient
   Ti, decision threshold
   Σi, variance of the distributions
[Figure: Characteristics of a typical sensor, showing the distributions P(Yi|X=0) and P(Yi|X=1) with Ki and Ti marked.]
Receiver Operating Characteristic
(ROC) Curve
 The ROC curve is the plot of the probability of correct detection (PD) vs. the probability of false positive (PF).
 The ROC curve is used to select an operating point, which provides the trade-off between PD and PF.
 Each sensor has a ROC curve, and the combination of the sensors into a decision tree has a composite ROC curve.
 The parameter varied to obtain different operating points on the ROC curve is the threshold for a single sensor, and a combination of thresholds for the decision tree.
 The Equal Error Rate (EER) is the operating point on the ROC curve where PF = 1 - PD.
[Figure: ROC curve of PD vs. PF, both axes from 0 to 1, with an operating point and the EER point marked, alongside the distributions P(Yi|X=0) and P(Yi|X=1) with Ki and Ti marked.]
Stroud-Saeger Experiments
 Stroud-Saeger ranked all trees formed from four given sensors
A, B, C and D according to increasing tree costs. The cost
function used was as shown in earlier slides.
 Values used in their experiment:
 CA = .25; KA = 4.37; ΣA = 1;
 CB = .25; KB = 1.53; ΣB = 1;
 CC = 10; KC = 2.9; ΣC = 1;
 CD = 30; KD = 4.6; ΣD = 1;

where Ci is the individual cost of utilization of sensor i, Ki is the
sensor discrimination power and Σi is the relative spread factor
for sensor i.
 Values of other variables are not known.
Cost Sensitivity to Global Parameters
 Values used in the experiment:
   CA = .25; P(YA=1|X=1) = .9856; P(YA=1|X=0) = .0144;
   CB = 1; P(YB=1|X=1) = .7779; P(YB=1|X=0) = .2221;
   CC = 10; P(YC=1|X=1) = .9265; P(YC=1|X=0) = .0735;
   CD = 30; P(YD=1|X=1) = .9893; P(YD=1|X=0) = .0107;
   where Ci is the individual cost of utilization of sensor i. The probabilities have been computed for a threshold corresponding to the equal error rate.
 CFalseNegative to be varied between 25 million and 500 billion dollars
   Low and high estimates of direct and indirect costs incurred due to a false negative.
 CFalsePositive to be varied between 180 and 720 dollars
   Cost incurred due to a false positive: 4 men * (3 to 6 hrs) * (15 to 30 $/hr)
 P(X=1) to be varied between 3/10^9 and 1/100,000
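The equal-error-rate probabilities quoted on this slide can be reproduced from the Gaussian sensor model. The sketch below assumes the erfc formulas given later in the deck, P(Yi=1|X=0) = 0.5 erfc[Ti/√2] and P(Yi=1|X=1) = 0.5 erfc[(Ti-Ki)/(Σi√2)], together with the Ki values from the Stroud-Saeger slide and Σi = 1. Setting PF = 1 - PD in that model gives Ti = Ki/(1 + Σi). The function names are illustrative.

```python
import math


def q(x):
    """Upper tail probability of a standard normal: 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def eer_threshold(k, sigma=1.0):
    """Threshold T at which PF = 1 - PD.

    PF = Q(T) and 1 - PD = Q((K - T) / sigma), so T = (K - T) / sigma,
    which gives T = K / (1 + sigma).
    """
    return k / (1.0 + sigma)


def operating_point(k, t, sigma=1.0):
    """Return (PF, PD) for discrimination coefficient k and threshold t."""
    return q(t), q((t - k) / sigma)


# Discrimination coefficients Ki from the Stroud-Saeger slide (sigma = 1).
SENSORS = {"A": 4.37, "B": 1.53, "C": 2.9, "D": 4.6}

if __name__ == "__main__":
    for name, k in SENSORS.items():
        pf, pd = operating_point(k, eer_threshold(k))
        print(f"{name}: T = {eer_threshold(k):.3f}, PF = {pf:.4f}, PD = {pd:.4f}")
```

With these inputs the computed PF values agree with the tabulated .0144, .2221, .0735 and .0107 to within about one unit in the fourth decimal place.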
Structure of the trees ranked first with 3 sensors (A, C and D)
[Figure: tree diagrams with sensor nodes a, b and c and leaves 0 and 1 for the trees listed below.]
Tree number 49, Boolean Expr: 01010111
Tree number 55, Boolean Expr: 01111111
Tree number 37, Boolean Expr: 00011111
Frequency of optimal trees with 3 sensors (A, C and D) when one parameter was varied
Randomly selected fixed parameter values

Constant Parameter(s)                                        Variable Parameter
P(X=1) = 1.281x10^-6, CFalsePositive = 492.61                CFalseNegative
P(X=1) = 0.8373x10^-5, CFalseNegative = 4.2681x10^11         CFalsePositive
CFalseNegative = 4.4747x10^11, CFalsePositive = 351.9526     P(X=1)

Tree Number   Frequency (out of 10,000)   Equivalent Boolean Expression
37            568                         00011111
55            9432                        01111111
55            9946                        01111111
37            54                          00011111
55            9946                        01111111
Variation of CTot vs. CFalseNegative
P(X=1) and CFalsePositive were kept constant at the specified value and CTot was
computed for 10,000 randomly selected values of CFalseNegative in the specified range.
Randomly selected fixed parameter values
Variation of CTot vs. CFalsePositive
P(X=1) and CFalseNegative were kept constant at the specified value and CTot was
computed for 10,000 randomly selected values of CFalsePositive in the specified range.
Randomly selected fixed parameter values
Variation of CTot vs. P(X=1)
CFalsePositive and CFalseNegative were kept constant at the specified value and CTot was
computed for 10,000 randomly selected values of P(X=1) in the specified range.
Randomly selected fixed parameter values
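The one-parameter experiments described above can be sketched as a small Monte Carlo loop: hold two parameters fixed, draw the third 10,000 times from its range, and count which candidate tree has the lowest CTot on each draw. Everything specific in this sketch is an assumption: the three candidate trees and their probabilities are hypothetical placeholders, Cfixed is treated as a constant per tree, and the draws are uniform over the range because the slides do not state the sampling distribution.

```python
import random
from collections import Counter


def total_cost(p_fa, p_det, c_fixed, prior, c_fp, c_fn):
    """CTot = CFP * P(X=0) * P(Y=1|X=0) + CFN * P(X=1) * P(Y=0|X=1) + Cfixed."""
    return c_fp * (1.0 - prior) * p_fa + c_fn * prior * (1.0 - p_det) + c_fixed


# Hypothetical candidates: name -> (P(Y=1|X=0), P(Y=1|X=1), Cfixed).
CANDIDATES = {
    "tree_1": (0.0010, 0.9100, 0.50),
    "tree_2": (0.0800, 0.9990, 10.25),
    "tree_3": (0.0002, 0.9000, 30.00),
}


def optimal_tree_frequencies(vary, low, high, fixed, runs=10_000, seed=0):
    """Count how often each candidate is cheapest when one parameter is varied.

    vary  : name of the varied parameter ("prior", "c_fp" or "c_fn")
    fixed : dict holding the other two parameters
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(runs):
        params = dict(fixed)
        params[vary] = rng.uniform(low, high)  # assumed uniform sampling
        best = min(CANDIDATES,
                   key=lambda name: total_cost(*CANDIDATES[name], **params))
        counts[best] += 1
    return counts


if __name__ == "__main__":
    freq = optimal_tree_frequencies(
        "c_fn", 25e6, 500e9, fixed={"prior": 3e-8, "c_fp": 600.0})
    for name, count in freq.most_common():
        print(name, count)
```

The two-parameter experiments follow the same pattern with two draws per run instead of one.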
Frequency of optimal trees with 3 sensors (A, C and D) when one parameter was varied
Fixed parameter values selected at Stroud and Saeger values

Constant Parameter(s)                            Variable Parameter   Tree Number   Frequency (out of 10,000)   Equivalent Boolean Expression
P(X=1) = 3x10^-8, CFalsePositive = 600           CFalseNegative       37            10000                       00011111
P(X=1) = 3x10^-8, CFalseNegative = 5x10^10       CFalsePositive       37            10000                       00011111
CFalseNegative = 5x10^10, CFalsePositive = 600   P(X=1)               49            108                         01010111
                                                                      37            694                         00011111
                                                                      55            9198                        01111111
Variation of CTot vs. CFalseNegative
P(X=1) and CFalsePositive were kept constant at the specified value and CTot was
computed for 10,000 randomly selected values of CFalseNegative in the specified range.
Fixed parameter values selected at Stroud and Saeger values
Variation of CTot vs. CFalsePositive
P(X=1) and CFalseNegative were kept constant at the specified value and CTot was
computed for 10,000 randomly selected values of CFalsePositive in the specified range.
Fixed parameter values selected at Stroud and Saeger values
Variation of CTot vs. P(X=1)
CFalsePositive and CFalseNegative were kept constant at the specified value and CTot was
computed for 10,000 randomly selected values of P(X=1) in the specified range.
Fixed parameter values selected at Stroud and Saeger values
CTot = CFalsePositive * P(X=0) * P(Y=1|X=0) + CFalseNegative * P(X=1) * P(Y=0|X=1) + Cfixed
[Figure: Variation of CTot with respect to CFalseNegative and CFalsePositive. Randomly selected fixed parameter values.]
[Figure: Variation of CTot with respect to CFalseNegative and P(X=1). Randomly selected fixed parameter values.]
[Figure: Variation of CTot with respect to CFalsePositive and P(X=1). Randomly selected fixed parameter values.]
[Figure: Variation of CTot with respect to CFalseNegative and CFalsePositive. Fixed parameter values selected at Stroud and Saeger values.]
[Figure: Variation of CTot with respect to CFalseNegative and P(X=1). Fixed parameter values selected at Stroud and Saeger values.]
[Figure: Variation of CTot with respect to CFalsePositive and P(X=1). Fixed parameter values selected at Stroud and Saeger values.]
Tree Structure and corresponding Boolean Expressions
[Figure: tree diagrams with sensor nodes a, b, c and d and leaves 0 and 1 for the trees listed below.]
Tree number 11785, Boolean Expr: 0111111111111111
Tree number 11605, Boolean Expr: 0101011111111111
Tree number 9133, Boolean Expr: 0001010111111111
Tree number 8965, Boolean Expr: 0001010101111111
Tree number 6797, Boolean Expr: 0001000101111111
Tree number 2473, Boolean Expr: 0000000101111111
Tree number 11305, Boolean Expr: 0101010101111111
Variation of CTot vs. CFalseNegative
P(X=1) and CFalsePositive were kept constant at the specified value and CTot was
computed for 10,000 randomly selected values of CFalseNegative in the specified range.
Randomly selected fixed parameter values
Variation of CTot vs. CFalsePositive
P(X=1) and CFalseNegative were kept constant at the specified value and CTot was
computed for 10,000 randomly selected values of CFalsePositive in the specified range.
Randomly selected fixed parameter values
Variation of CTot vs. P(X=1)
CFalsePositive and CFalseNegative were kept constant at the specified value and CTot was
computed for 10,000 randomly selected values of P(X=1) in the specified range.
Randomly selected fixed parameter values
Variation of CTot vs. CFalseNegative
P(X=1) and CFalsePositive were kept constant at the specified value and CTot was
computed for 10,000 randomly selected values of CFalseNegative in the specified range.
Fixed parameter values selected at Stroud and Saeger values
Variation of CTot vs. CFalsePositive
P(X=1) and CFalseNegative were kept constant at the specified value and CTot was
computed for 10,000 randomly selected values of CFalsePositive in the specified range.
Fixed parameter values selected at Stroud and Saeger values
Variation of CTot vs. P(X=1)
CFalsePositive and CFalseNegative were kept constant at the specified value and CTot was
computed for 10,000 randomly selected values of P(X=1) in the specified range.
Fixed parameter values selected at Stroud and Saeger values
Frequency of optimal trees with 4 sensors when two parameters were varied. The fixed parameters were randomly selected.

Constant Parameter              Variable Parameters              Tree Number   Frequency (out of 10,000)   Equivalent Boolean Expression
CFalsePositive = 453.6849       CFalseNegative, P(X=1)           505           1                           0000000001111111
                                                                 6797          18                          0001000101111111
                                                                 8965          50                          0001010101111111
                                                                 9001          7                           0001010101111111
                                                                 9017          6                           0001010101111111
                                                                 9133          235                         0001010111111111
                                                                 11605         8621                        0101011111111111
                                                                 11785         1062                        0111111111111111
CFalseNegative = 4.7485x10^10   P(X=1), CFalsePositive           2617          1                           0000000111111111
                                                                 6797          16                          0001000101111111
                                                                 8965          121                         0001010101111111
                                                                 9001          7                           0001010101111111
                                                                 9017          13                          0001010101111111
                                                                 9133          392                         0001010111111111
                                                                 11305         99                          0101010101111111
                                                                 11605         9351                        0101011111111111
P(X=1) = 0.6344x10^-5           CFalseNegative, CFalsePositive   6797          2                           0001000101111111
                                                                 8965          13                          0001010101111111
                                                                 9133          65                          0001010111111111
                                                                 11305         13                          0101010101111111
                                                                 11605         7928                        0101011111111111
                                                                 11785         1979                        0111111111111111
CTot = CFalsePositive * P(X=0) * P(Y=1|X=0) + CFalseNegative * P(X=1) * P(Y=0|X=1) + Cfixed
[Figure: Variation of CTot with respect to CFalseNegative and CFalsePositive. Randomly selected fixed parameter values.]
[Figure: Variation of CTot with respect to CFalseNegative and P(X=1). Randomly selected fixed parameter values.]
[Figure: Variation of CTot with respect to CFalsePositive and P(X=1). Randomly selected fixed parameter values.]
Frequency of optimal trees with 4 sensors when two parameters were varied. The fixed parameters were selected at the Stroud and Saeger values.

Constant Parameter         Variable Parameters              Tree Number   Frequency (out of 10,000)   Equivalent Boolean Expression
CFalsePositive = 600       CFalseNegative, P(X=1)           505           1                           0000000001111111
                                                            2473          2                           0000000101111111
                                                            2509          1                           0000000101111111
                                                            6797          18                          0001000101111111
                                                            8965          138                         0001010101111111
                                                            9001          19                          0001010101111111
                                                            9017          7                           0001010101111111
                                                            9133          184                         0001010111111111
                                                            11305         65                          0101010101111111
                                                            11605         9232                        0101011111111111
                                                            11785         333                         0111111111111111
CFalseNegative = 5x10^10   P(X=1), CFalsePositive           6797          14                          0001000101111111
                                                            8965          117                         0001010101111111
                                                            9001          9                           0001010101111111
                                                            9017          9                           0001010101111111
                                                            9133          374                         0001010111111111
                                                            11305         96                          0101010101111111
                                                            11605         9381                        0101011111111111
P(X=1) = 3x10^-8           CFalseNegative, CFalsePositive   505           11                          0000000001111111
                                                            775           5                           0000000100001111
                                                            2473          42                          0000000101111111
                                                            2617          40                          0000000111111111
                                                            6797          558                         0001000101111111
                                                            8965          3833                        0001010101111111
                                                            9133          5406                        0001010111111111
                                                            11605         105                         0101011111111111
CTot = CFalsePositive * P(X=0) * P(Y=1|X=0) + CFalseNegative * P(X=1) * P(Y=0|X=1) + Cfixed
[Figure: Variation of CTot with respect to CFalseNegative and CFalsePositive. Fixed parameter values selected at Stroud and Saeger values.]
[Figure: Variation of CTot with respect to CFalseNegative and P(X=1). Fixed parameter values selected at Stroud and Saeger values.]
[Figure: Variation of CTot with respect to CFalsePositive and P(X=1). Fixed parameter values selected at Stroud and Saeger values.]
Sensitivity to Sensor Performance
The following experiments were done using sensors A, B, C and D, as described below, by varying the individual sensor thresholds TA, TB and TC from -4.0 to +4.0 in steps of 0.4. These values were chosen because they give a ROC curve for the individual sensors over the complete range of P(Yi=1|X=0) and P(Yi=1|X=1).
CA = .25; KA = 4.37; ΣA = 1
CB = .25; KB = 1.53; ΣB = 1
CC = 15; KC = 2.9; ΣC = 1
CD = 30; KD = 4.6; ΣD = 1
where Ci is the individual cost of utilization of sensor i, Ki is the discrimination power of the sensor and Σi is the spread factor for the sensor.
The probability of false positive for the ith sensor is computed as:
P(Yi=1|X=0) = 0.5 erfc[Ti/√2]
The probability of detection for the ith sensor is computed as:
P(Yi=1|X=1) = 0.5 erfc[(Ti-Ki)/(Σi√2)]
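The two formulas above translate directly into a threshold sweep that traces each sensor's ROC curve. The sketch below uses the stated grid of -4.0 to +4.0 in steps of 0.4 and the Ki values listed on this slide; the function names and the tuple layout are illustrative.

```python
import math


def q(x):
    """0.5 * erfc(x / sqrt(2)), the upper tail of a standard normal."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def roc_points(k, sigma=1.0, t_min=-4.0, t_max=4.0, step=0.4):
    """Return a list of (T, PF, PD) over the threshold grid.

    PF = P(Yi=1|X=0) = 0.5 erfc[T / sqrt(2)]
    PD = P(Yi=1|X=1) = 0.5 erfc[(T - K) / (sigma * sqrt(2))]
    """
    n_steps = round((t_max - t_min) / step)
    points = []
    for i in range(n_steps + 1):
        t = t_min + i * step
        points.append((t, q(t), q((t - k) / sigma)))
    return points


# Discrimination power Ki of each sensor (sigma = 1 for all four).
SENSORS = {"A": 4.37, "B": 1.53, "C": 2.9, "D": 4.6}

if __name__ == "__main__":
    for name, k in SENSORS.items():
        print(f"Sensor {name}")
        for t, pf, pd in roc_points(k):
            print(f"  T = {t:+.1f}  PF = {pf:.4f}  PD = {pd:.4f}")
```

Each sensor yields 21 operating points. Lowering the threshold moves along the curve toward PF = PD = 1, and raising it moves toward PF = PD = 0.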
Frequency of optimal trees with 3 sensors when the thresholds were varied. The fixed parameters (CFalsePositive, CFalseNegative, P(X=1)) were selected randomly. Fifteen trees attained rank one, out of which tree number 37 was the most frequent.
Constants: CFalseNegative = 5.0125x10^9, P(X=1) = 5.05x10^-6 and CFalsePositive = 450

Tree Number   Frequency   Boolean Expression
27            114         00010111
29            146         00010111
2             183         00000001
49            264         01010111
51            322         01010111
25            957         00010111
23            1475        00010101
15            2437        00010011
38            4572        00011111
19            5256        00010101
1             5828        00000001
45            5873        00110111
55            10587       01111111
7             13392       00000111
37            17515       00011111
Performance (ROC) of Best Decision Tree for
Tree number 37
Performance (ROC) of Best Decision Tree for
Tree number 37
Frequency of optimal trees with 4 sensors when the thresholds were varied. The fixed parameters (CFalsePositive, CFalseNegative, P(X=1)) were selected randomly. 244 trees attained rank one, out of which tree number 445 was the most frequent. Only the 15 most frequently occurring optimal trees out of the 241 are tabulated below.
Constants: CFalseNegative = 4.8668x10^11, P(X=1) = 7.5361x10^-6 and CFalsePositive = 499.75

Tree Number   Frequency   Boolean Expression
445           13012       0000000001010111
145           11143       0000000000010101
11605         10958       0101011111111111
505           10545       0000000001111111
2617          10139       0000000111111111
9133          9280        0001010111111111
5761          5942        0000011111111111
11785         5910        0111111111111111
325           5574        0000000000011111
506           5249        0000000001111111
11791         5196        0111111111111111
8003          4539        0001001111111111
10783         4496        0001111111111111
386           4018        0000000000110111
87            3402        0000000000010011
Performance (ROC) of Best Decision Tree for
tree number 445
Performance (ROC) of Best Decision Tree for
tree number 445
Cost-based BDT
 For each node i:
   Calculate Cterm at that node based upon (Pi x Cfn > or < Ni x Cfp)
   For each sensor s:
     Vary the threshold from -4 to 4 in steps of 0.1
     Calculate the least expensive sensor cost Cs for node i
   If Cterm > Cs then introduce sensor s at node i, else terminate the branch
 Total runs = 10000
 Count the frequency of optimum trees
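One possible reading of the procedure above is a greedy, one-step-lookahead construction. The sketch below is an interpretation, not the authors' implementation, and several pieces are assumptions: Pi and Ni are taken to be the joint probabilities of reaching node i with X=1 and with X=0, Cterm is the cheaper of the two ways of terminating, and Cs is the cost of using sensor s at its best threshold and then terminating both branches. Sensor costs and Ki values are those listed under Sensitivity to Sensor Performance.

```python
import math


def q(x):
    """0.5 * erfc(x / sqrt(2)), the upper tail of a standard normal."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


# name -> (cost Ci, discrimination Ki, spread sigma_i), taken from the deck.
SENSORS = {"A": (0.25, 4.37, 1.0), "B": (0.25, 1.53, 1.0),
           "C": (15.0, 2.9, 1.0), "D": (30.0, 4.6, 1.0)}


def build_tree(p_mass, n_mass, sensors, c_fn, c_fp):
    """Return a leaf (0 or 1) or (sensor, threshold, subtree_if_0, subtree_if_1).

    p_mass, n_mass : assumed joint probability of reaching this node with
                     X=1 / X=0 (the Pi and Ni of the slide)
    sensors        : sensors not yet used on the path to this node
    """
    cost_say_0 = p_mass * c_fn  # terminating with 0 misses every X=1 case here
    cost_say_1 = n_mass * c_fp  # terminating with 1 raises a false alarm
    c_term = min(cost_say_0, cost_say_1)
    leaf = 0 if cost_say_0 <= cost_say_1 else 1

    best = None
    for name, (cost, k, sigma) in sensors.items():
        for step in range(81):  # thresholds -4.0 to 4.0 in steps of 0.1
            t = -4.0 + 0.1 * step
            pf, pd = q(t), q((t - k) / sigma)
            # Assumed Cs: pay for the sensor, then terminate both branches.
            c_s = ((p_mass + n_mass) * cost
                   + min(p_mass * pd * c_fn, n_mass * pf * c_fp)
                   + min(p_mass * (1 - pd) * c_fn, n_mass * (1 - pf) * c_fp))
            if best is None or c_s < best[0]:
                best = (c_s, name, t, pf, pd)

    if best is None or c_term <= best[0]:
        return leaf  # no sensor beats terminating here
    _, name, t, pf, pd = best
    rest = {s: v for s, v in sensors.items() if s != name}
    return (name, round(t, 1),
            build_tree(p_mass * (1 - pd), n_mass * (1 - pf), rest, c_fn, c_fp),
            build_tree(p_mass * pd, n_mass * pf, rest, c_fn, c_fp))


if __name__ == "__main__":
    prior = 5e-6
    print(build_tree(prior, 1.0 - prior, SENSORS, c_fn=5e9, c_fp=450.0))
```

Repeating this for 10,000 draws of the global parameters and tallying the resulting tree shapes would give the frequency count named in the last two bullets.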
Structure of most frequent trees with 4
sensors
Performance of Cost Based BDT