BOOLEAN FUNCTIONS AND SOME
ISSUES IN DISCOVERY AND INVENTION*
David A. Schum
School of Information Technology & Engineering
School of Law
George Mason University
Honorary Professor of Evidence Science
University College London
Revised: 1 April, 2005
* The author is very grateful for the support provided by the National Aeronautics and Space
Administration under contracts # 526964 and # 526965 from the NASA Langley Research Center
to George Mason University, and for the support provided by the Leverhulme Foundation and the
Economic and Social Research Council [UK] to University College London.
1.0 INTRODUCTORY COMMENTS
I was very fortunate to have been asked by Professor Tomasz Arciszewski to participate
in his research on a computer-based system called Inventor 2000 [and its successors], which
had, as its initial objective, improvements in the design of wind-bracing systems for tall buildings.
Equally fortunate has been my association over the past eight years with Col. Carl Hunt. I was
privileged to serve as director of Col. Hunt’s doctoral dissertation on a computer-based system
called ABEM [Agent Based Evidence Marshaling]. At the time, Col. Hunt was Commanding
Officer, Computer Crime Investigative Unit, Criminal Investigation Division, United States Army.
This system has been designed by Col. Hunt to facilitate discovery-related tasks in criminal
investigation and in many other contexts. These two research activities may seem quite
unrelated, but they are not. In a recent paper [Schum, 2001] I illustrated how these two research
activities are in fact symbiotic and commensal. They are symbiotic since ideas generated in work
on Inventor 2000 have been useful in work related to ABEM; ideas generated in work on ABEM
have been useful in work on Inventor 2000. Two major items on the common table at which both
research activities feed are the complexity of the processes under investigation in both research
ventures and the role of imaginative, creative, or inventive thought that is driven by curiosity. The
utility of Inventor 2000 goes far beyond the boundaries of the initial specific context in which it has
been applied [wind-bracing systems for tall buildings]. It is in fact an elegant test bed for
investigating a variety of issues in the study of discovery and invention. Similarly, work on the
ABEM system raises issues that have great importance in any kind of discovery or investigative
tasks regardless of the specific context in which they are encountered. How we marshal or
organize our existing thoughts and evidence greatly influences how successful we will be in
generating new thoughts and new evidence.
In my own thinking about Inventor 2000 and ABEM I have, for various purposes,
encountered the need for Boolean functions and their alternative means of expression. In my
work thus far, I have described these functions and their alternative expressions rather casually.
A major purpose of the present paper is to provide a more careful account of these useful formal
devices. My apologies go out immediately to readers for whom the developments in this paper
are already well known. My second objective of course is to try to generate new thoughts about
the very difficult matters being investigated in research on Inventor 2000 and ABEM. This paper
is being written for those interested in our present work and for those whose own thoughts may
be stimulated by the formal developments and discussion to follow. Boolean functions arise in
many contexts, especially in probability theory. Study of these functions and their alternative
means of expression provides very useful methods for dealing with complex events to which
probability measures are to be applied. But, as I hope to illustrate, these functions also arise in
the study of many other phenomena including those associated with our work on Inventor 2000
and ABEM. One current activity I will later discuss concerns Stuart Kauffman's work on self-organizing systems and interesting phase transitions that result when such systems are
expressed in terms of Boolean functions and are exercised in various ways [Kauffman, 2000]. I
will relate this work to our studies of evidence marshaling and discovery.
The search engine that drives Inventor 2000 [and its successor Inventor 2001] is based
on the strategy of evolutionary computation as developed for the current projects by Professor
Ken De Jong. This search and optimization strategy rests on algorithms that involve the
genetically inspired processes of selection, recombination, random variation, and competition of,
in our case, alternative wind bracing and other designs. To be useful in our current work, I must
relate the following ideas concerning Boolean functions to these genetically inspired processes.
At the very least, the Boolean function ideas I now present give us a different language to employ
in our work on the search and inquiry mechanisms necessary during the processes of discovery
and invention. My major reference source on evolutionary computation is the recent work of
Dumitrescu, et al [2000]. I begin with a definition of a Boolean function and the different ways in
which these functions may be expressed and analyzed.
2.0 BOOLEAN FUNCTIONS AND THEIR CANONICAL FORMS
Every new faculty member beginning his/her academic career ought to have a colleague,
friend, and mentor such as the one I had when I took my first academic position at Rice University
in 1966. Professor Paul E. Pfeiffer, now Emeritus Professor of Mathematical Sciences at Rice
University, served all three of these roles for me during my entire 20 years at Rice. We shared a
common interest in probability theory. Paul had already written several notable works on
probability theory and its applications before our association began. His Concepts of Probability
Theory [ Pfeiffer, 1965] is a classic and is still available in the Dover series. His earlier Sets,
Events, and Switching [Pfeiffer, 1964] is also a classic that has been so helpful to many persons,
not just those whose interest is in electrical engineering. How honored I was when Paul asked me
to collaborate on a book entitled Introduction to Applied Probability [Pfeiffer & Schum, 1973].
Working with Paul on this book was one of the most enjoyable and profitable educational
experiences of my life. After I left Rice for George Mason University in 1985, Paul wrote a much-expanded version of our work entitled Probability for Applications [Pfeiffer, 1990]. One of Paul's
abiding concerns has been that students are rarely [except in his works] given very extensive
tutoring in strategies for handling complex events, i.e. compound events that involve unions,
intersections, and complementations of these events. All of my writings on evidence and
probability for the past thirty years or so carry Paul’s stamp. I could not have proceeded in my
work on evidence and inference without Paul’s wise and patient tutoring on strategies for coping
with situations in which we have many events to consider in probabilistic reasoning and in which
we seek to relate and combine them in various ways.
By definition, a Boolean function (f) of a finite class A of sets [events] is a rule of
combination of these sets based on a finite number of applications of the operations of forming
unions, intersections, and complements. It will be convenient in what follows to let this finite class
of sets [events] be represented by A = {An-1, An-2, ..., A1, A0}. As you see, A is simply a listing of n
sets or events. The particular numbering of them beginning at (n -1) and ending at zero has
useful properties to be mentioned a bit later on. For some purposes, class A may involve listings
of events that are not distinguished by subscripts; for example, A = {A, B, C}.
In writing a Boolean function of events there are certain conventions that I will follow that
simplify the writing. The intersection symbol [∩] is usually suppressed. Thus, A ∩ B is written as
AB. The union symbol [∪] is never suppressed. Another convention I will follow is to use the
following symbol to indicate the complement or negation of an event: Ec = E-complement, or
not-E. One more convention concerns the union or disjunction of two or more mutually exclusive
events. Paul Pfeiffer used the conventional union symbol with a horizontal bar across the arms of
the symbol to indicate a disjoint union. I cannot reproduce this disjoint union symbol on my
computer and will use instead the symbol ⊕, read "circle-plus". Thus, A ⊕ B is read "A or B, but
not both". At least one other work follows this convention [Gregg, 1998].
2.1 The Algebra of Sets: A Brief Review
Just in case you have forgotten the basic rules for combining sets in forming and
analyzing Boolean functions, here are some of the basic rules. I will note the ones that are
particularly important in the discussion to follow. First, let Ω represent a basic space or universal
set of all possible elements or outcomes ω in some well-defined situation. In what follows I will
refer to Ω as the universe of discourse or simply the space of all possibilities. Then let ∅
represent the empty or vacuous set. In probability ∅ is called the impossible event. Let A, B, and
C represent any subsets of Ω.
Complement Rules:
• A ∪ Ac = Ω
• A ∩ Ac = ∅ [i.e. A and Ac form a partition of Ω since A and Ac are mutually
exclusive and exhaustive of Ω]
• Ωc = ∅
• ∅c = Ω
• [Ac]c = A.
Identity Rules:
• Ω ∩ A = A
• Ω ∪ A = Ω
• ∅ ∩ A = ∅
• ∅ ∪ A = A
Idempotent Rules:
• A ∪ A = A
• A ∩ A = A
Commutative Rules:
• A ∪ B = B ∪ A
• A ∩ B = B ∩ A
De Morgan's Rules:
• (A ∪ B)c = Ac ∩ Bc
• (A ∩ B)c = Ac ∪ Bc. [De Morgan's rules are very important in seeing what
happens when we decompose or express a Boolean function in different ways
using what I will later term minterms and maxterms].
Associative Rules:
• (A ∪ B) ∪ C = A ∪ (B ∪ C)
• (A ∩ B) ∩ C = A ∩ (B ∩ C)
Distributive Rules:
• A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
• A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C). [These two rules are also very important in
decomposing Boolean functions]
As you see, some of these rules involve the simplest possible Boolean functions; for
example, f1(A, B) = A ∪ B; f2(A, B, C) = A ∩ (B ∪ C); or f3(A, B) = (A ∪ B)c. But we wish to have
some way of analyzing Boolean functions that are not this simple and whose analysis requires us
to express a given Boolean function in different ways. The first method of analysis I will mention
involves expressing a Boolean function in what is called its disjunctive canonical form. As you will
see, this very useful method of analysis allows us to express any Boolean function in terms of the
finest grain partitioning of any basic space Ω.
2.2 Minterms and the Minterm Expansion Theorem
Since all Boolean functions involve events and their complements, each event A and its
complement Ac forms a class of events that partitions a basic space Ω; i.e. the class of events
Aj = {Aj, Ajc} partitions Ω. This I noted above in discussing the complementation property in the
algebra of sets. But now suppose that we have a class A = {An-1, An-2, ..., A1, A0} of n events. A
minterm [or minimal polynomial as it is also called (see Birkhoff & Mac Lane, 1965, 323-324)] is
the intersection set M of the form

M = Yn-1 ∩ Yn-2 ∩ ... ∩ Y1 ∩ Y0,

where each Yj is either Aj or Ajc. The disjoint and
exhaustive class of all such minterms is called the partition [of Ω] generated by class A. Thus, A
is called the generating class for the partition and the individual Aj are called the generating
events.
As an example of minterm generation consider the class of events A = {A2, A1, A0}. We
might ordinarily suppose that none of the events in this class is empty, but this is not essential. In
addition, it is not necessary to suppose that the events in a generating class are mutually
exclusive. In the present example we have three classes of events A2 = {A2, A2c}; A1 = {A1, A1c};
and A0 = {A0, A0c}. Each of these three classes forms a partition of basic space Ω, since the
events in each of these three classes are mutually exclusive and exhaustive of Ω. When we
consider the joint partitioning of Ω in terms of these three classes of events we observe that we
will have eight intersection sets or minterms that are listed as follows:
A2cA1cA0c
A2cA1cA0
A2cA1A0c
A2cA1A0
A2A1cA0c
A2A1cA0
A2A1A0c
A2A1A0
In general, when we have n events in some generating class A, we will generate 2^n minterms.
Observe that the minterms we have generated in the above example are indeed mutually
exclusive. The reason is that the pattern of complementation in each minterm is different. No
element or outcome ω can reside in more than one of these minterms. This will be true in general
for n events in a generating class. The resulting 2^n minterms will be mutually exclusive and also
exhaustive of a basic space Ω.
There are different ways in which we can portray the collection of minterms generated by
some class A. The first, shown below, is tabular in nature and serves to illustrate one very
convenient way of keeping a systematic account of the minterms that are generated. This method
leads to what is called the binary designator method for numbering minterms. I will illustrate this
method using the case above in which we have n = 3 events in a generating class. First, if an
event in a minterm does not carry a complementation we assign it the number 1; if it does appear
complemented, we assign it the number 0. This of course preserves the binary nature of events in
any class Ai = {Ai, Aic}. Take minterm A2A1cA0 as an example: we assign it the numbers 101. This
is called the binary designator for the minterm. In more current literature a binary designator is
termed a bit string. The decimal equivalent for the bit string 101 is: 1(2^2) + 0(2^1) + 1(2^0) = 5. The
following table shows the binary designators assigned to each of the eight minterms when we
have n = 3 events, their decimal equivalents, and their minterm symbol.
Minterm       Binary Designator    Decimal       Minterm
              A2  A1  A0           Equivalent    Symbol
A2cA1cA0c     0   0   0            0             M0
A2cA1cA0      0   0   1            1             M1
A2cA1A0c      0   1   0            2             M2
A2cA1A0       0   1   1            3             M3
A2A1cA0c      1   0   0            4             M4
A2A1cA0       1   0   1            5             M5
A2A1A0c       1   1   0            6             M6
A2A1A0        1   1   1            7             M7
This tabular binary designator account of minterms makes use of the particular ordering of events
in a generating class A that I mentioned earlier. In determining the decimal equivalent for the
numbering of minterms the subscript on an event indicates the power to which the number 2 is to
be raised. The event’s binary designator simply indicates whether or not this power of 2 is
included in the sum of powers of 2 across the events. Another example, for M2, is: 0(2^2) + 1(2^1) +
0(2^0) = 2. As I noted, binary designators can also be called bit strings.
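The correspondence between minterms, bit strings, and decimal minterm numbers is easy to mechanize. Here is a minimal Python sketch (my own illustration, not part of the original analysis) that regenerates the table above:

```python
from itertools import product

def minterm_table(names):
    """Yield (minterm, bit string, decimal number) rows for the given events.
    names[0] is the highest-order event, matching the binary-designator ordering."""
    n = len(names)
    for bits in product((0, 1), repeat=n):                 # 000, 001, ..., 111
        term = "".join(a + ("" if b else "c") for a, b in zip(names, bits))
        decimal = sum(b * 2 ** (n - 1 - j) for j, b in enumerate(bits))
        yield term, "".join(map(str, bits)), decimal

for term, bits, dec in minterm_table(["A", "B", "C"]):
    print(f"{term:<8} {bits}  M{dec}")                     # e.g. ABcC  101  M5
```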
For a variety of purposes it is useful to portray a collection of generated minterms in a
variation of a Venn diagram called a minterm map. The figure below shows the minterm map for
the example in which we have n = 3 events in a generating class. The advantage of a minterm
map over a conventional Venn diagram is that it illustrates clearly the disjoint nature of the
minterms and, as I will illustrate later, it allows us to portray analyses of Boolean functions in very
orderly ways.
               A2c               A2
           A1c     A1        A1c     A1
A0c        M0      M2        M4      M6
A0         M1      M3        M5      M7

[The entire map represents the basic space Ω.]
In the following discussion sloth overtakes me and I shall avoid having to write subscripts
on events in some generating class A. As I consider larger classes of events and Boolean functions
of these events the use of subscripts gets very tedious and is unnecessary. I can still preserve
the binary designator ordering and labeling of minterms provided that I order the events in certain
ways and preserve this ordering on minterm maps I will provide. The following table illustrates my
method for the case in which generating class A = {A, B, C}.
Minterm     Binary Designator    Minterm
            A   B   C            Number
AcBcCc      0   0   0            M0
AcBcC       0   0   1            M1
AcBCc       0   1   0            M2
AcBC        0   1   1            M3
ABcCc       1   0   0            M4
ABcC        1   0   1            M5
ABCc        1   1   0            M6
ABC         1   1   1            M7
All that really matters here is that I preserve the ordering of the binary designators and assume
that A would get subscript 2, B the subscript 1, and C the subscript 0. The minterm map I
generate using class A = {A, B, C} is as follows:
               Ac                A
           Bc      B         Bc      B
Cc         M0      M2        M4      M6
C          M1      M3        M5      M7

[Again, the entire map represents the basic space Ω.]
One way of describing a minterm map is to say that it represents the finest grain
partitioning of a basic space Ω that is consistent with the definition of a Boolean function. The
reason is that such functions only permit the consideration of binary event classes such as
{A, Ac}. It's true of course that the phenomena underlying these events may have many possible
states or levels or may even exist on a continuum. In any case, these states or levels, however
many there are, can always be partitioned in a binary manner. For example, we can partition the
heights of people, in theory a continuum, into the binary class: A = persons less than five feet tall,
and Ac = persons five feet tall or over. So, if we could express some arbitrary Boolean function in
terms of minterms we could also say that this function has been expressed in terms of the finest
grain elements that such functions allow. A Boolean function expressed in terms of minterms is
often said to be expressed in its canonical form. The word "canonical" in mathematics is used to
indicate some “standard” form in which a result might be expressed. However, as I will discuss,
there is more than one way in which a Boolean function can be expressed in a canonical form.
The first way of expressing a Boolean function in canonical form rests on an important
result called the minterm expansion theorem. This theorem is expressed as follows:
A Boolean function f of a finite class A = {An-1, An-2, ..., A1, A0} of sets [events] can be
expressed in one and only one way as the disjoint union of a subclass of the minterms
[M] in the partition generated by the class A.
This theorem can be expressed in symbols as:

F = f(An-1, An-2, ..., A0) = ⊕ Mi, taken over i ∈ JF,

where JF is a uniquely determined subset of the index set J = {0, 1, ..., 2^n - 1}.
It is understood here that the union expressed in this theorem is the disjoint union of this unique
subset of minterms, since the minterms are by their construction mutually exclusive. The index
set J here simply lists the 2^n possible minterms generated by class A. Proof of the minterm
expansion theorem is given in Pfeiffer & Schum [1973, 169-170].
I pause here for a moment to note that this theorem informs us about how many different
Boolean functions there are when such functions involve n events. If each different Boolean
function involves a unique subset of minterms, as this theorem asserts, then all we need to do is
to determine the number of unique subsets of minterms. The answer is: 2 raised to the power 2^n
possible subsets. Thus, for a generating class involving just five events, there are 2^32 =
4,294,967,296 possible Boolean functions involving these five events. In just a moment I will
mention how to determine whether two differently stated Boolean functions are in fact equivalent.
My next task is to discuss how we go about the task of expressing a Boolean function in
disjunctive canonical form in terms of minterms. Many years ago I ran across a very good
account of the necessary steps in this task in the classic text by Birkhoff and Mac Lane [1965, 3rd
ed, 322-324]. There are four steps in this process. Some of the steps need to be applied more
than once and, on occasion, some of the steps can be omitted. In addition, the order of the last
two steps may be reversed. Here first are the four steps necessary; each step involves rules in
the algebra of sets I summarized at the outset [one reason why I provided this summary]. I will
then give some examples involving Boolean functions.
Step 1: Use De Morgan's law to move complements from outside to inside any
parenthesis; for example, (AB)c = (Ac ∪ Bc); (A ∪ B)c = AcBc.
Step 2: Use the distributive law for intersection to move intersections inside parentheses;
for example, A(B ∪ C) = (AB) ∪ (AC).
Step 3: Use the idempotency and complementary rules to omit certain terms; for
example, AA = A; A ∪ A = A; AAc = ∅; A ∪ ∅ = A.
Step 4: Write equivalent expressions for any term that does not contain n events in its
intersection. For example, for n = 2 and A = {A, B}, write A as A = AB ⊕ ABc. [Remember that ⊕
means the disjoint union]. As another example, when n = 3 and A = {A, B, C}, write ABc as ABcC
⊕ ABcCc.
Example 1: F = f1(A, B, C) = [A ∪ (B ∪ Cc)c]c
F = Ac(B ∪ Cc) [Step 1]
= AcB ∪ AcCc [Step 2]
= AcBC ∪ AcBCc ∪ AcBCc ∪ AcBcCc [Step 4]
= AcBC ⊕ AcBCc ⊕ AcBcCc [Step 3]
f1 = M0 ⊕ M2 ⊕ M3.
Example 2: F = f2(A, B, C) = (AB ∪ Cc)(Ac ∪ C)
= (AB)(Ac ∪ C) ∪ Cc(Ac ∪ C)
= ABAc ∪ ABC ∪ AcCc ∪ CCc
= ABC ∪ AcCc
= ABC ⊕ AcBCc ⊕ AcBcCc
= M0 ⊕ M2 ⊕ M7.
I’ll pause here for a moment to mention why it is so often important to be able to
decompose a Boolean function into its unique collection of minterms. Later on I will give specific
examples of the necessity of performing such decompositions in contexts such as engineering
design and criminal investigations. In the two examples I have just provided the Boolean
functions f1 and f2 might represent general conditions, requirements, or statements that must be
satisfied. The decomposition of these functions into unique disjoint collections of minterms simply
provides a listing of all the specific ways in which these general conditions, requirements, or
statements can be satisfied. Being able to provide listings of all the specific ways in which some
complex Boolean expression can be satisfied turns out to have very important consequences in
discovery and invention, regardless of the context in which these activities occur. Later on I will
illustrate how these results from the minterm expansion theorem correspond to the idea of a
schema in evolutionary computation [Dumitrescu, et al, 2000, 33-35].
As a final note on formal issues associated with Boolean functions and their minterm
expansions I return to a consideration of the number of Boolean functions that are possible given
the n events in a generating class; the number is two raised to the power 2^n. As I noted above,
this large number refers to the number of unique subsets of the 2^n possible minterms when we
have n events in a generating class. One interesting fact is that two or more apparently different
Boolean functions may in fact be equivalent in the sense that they can be expressed by the same
unique subset of minterms. Another way of saying this is to say that two requirements, conditions,
statements, or schema, expressed in terms of Boolean functions, may be saying the same thing,
even though they are expressed differently. Here is an example:
Example: Let f1(A, B, C) = A ∪ BC; and let f2(A, B, C) = ACc ∪ (A ∪ B)C. Using the
minterm expansion process I just discussed, we can easily determine that:
f1 = f2 = M3 ⊕ M4 ⊕ M5 ⊕ M6 ⊕ M7.
This means that these two, apparently different, Boolean functions are in fact equivalent and are
saying the same things when they are decomposed into their specific minterm elements.
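Checking the claimed equivalence mechanically takes only a few lines, reusing the brute-force expansion sketched above:

```python
from itertools import product

def minterm_expansion(f, n):
    # indices of the minterms on which f is true (as in the earlier sketch)
    return sorted(sum(bit << (n - 1 - j) for j, bit in enumerate(bits))
                  for bits in product((False, True), repeat=n) if f(*bits))

f1 = lambda A, B, C: A or (B and C)                       # A ∪ BC
f2 = lambda A, B, C: (A and not C) or ((A or B) and C)    # ACc ∪ (A ∪ B)C
assert minterm_expansion(f1, 3) == minterm_expansion(f2, 3) == [3, 4, 5, 6, 7]
print("f1 and f2 are equivalent")
```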
2.3 Another Canonical Form: Maxterms
As I have just illustrated, any Boolean function f can be expressed in canonical or
standard form in terms of the disjoint union of a unique subset of minterms. But there is another
formally equivalent canonical form of a Boolean function that arises from application of de
Morgan's laws. We first define a maxterm as:

Max = Yn-1 ∪ Yn-2 ∪ ... ∪ Y1 ∪ Y0,

where Yj is either Aj or Ajc. It happens that we can express any Boolean function in terms of the
intersection of a unique subset of maxterms. This maxterm expansion will be equivalent to a
corresponding minterm expansion. I first encountered the idea of a maxterm in Paul Pfeiffer's
Sets, Events, and Switching [1964, p 75]. I also found discussions of the essential ideas of
maxterms in a probability book by Edward Thorp [1966, p 25], though he did not label these
disjunctive expressions maxterms. All he says is that we can express any Boolean function in two
equivalent forms; one of which involves the disjoint union of intersection sets and the other
involves the intersection of disjunctive sets. In the book by Birkhoff and Mac Lane [1965, pp. 324-325] the whole idea of maxterm expansions of Boolean functions is left as an exercise for the
student. In my work on evidence and probabilistic reasoning these past years, I have never had
any occasion to use a maxterm expansion of a Boolean function; but I have had many occasions
to use minterm expansions. Consequently, I have only recently taken an interest in maxterms and
expansions of Boolean functions as intersections of unique subsets of maxterms. I was pleased
to find one recent source of information about maxterms and their use in determining another
canonical form of Boolean functions [Gregg, 1998, pp. 117-121].
There are several ways in which we can generate a maxterm expansion of a given
Boolean function. The easiest way, it seems, is to begin with a minterm expansion of this function
and then apply de Morgan's law [twice] to it. I will illustrate this process using a Boolean function
for which we already have a minterm expansion. Consider the function f(A, B, C) =
(AB ∪ Cc)(Ac ∪ C), whose minterm expansion, as we saw above, is: ABC ⊕ AcBCc ⊕ AcBcCc =
M0 ⊕ M2 ⊕ M7. In this case in which n = 3 events in a generating class, there are eight possible
minterms whose disjoint union is Ω. The remaining minterms not in the expansion of f(A, B, C)
are: M1, M3, M4, M5, and M6. Let's first express their disjoint union: [M1 ⊕ M3 ⊕ M4 ⊕ M5 ⊕ M6].
Now observe that [M1 ⊕ M3 ⊕ M4 ⊕ M5 ⊕ M6]c = M0 ⊕ M2 ⊕ M7. If we now apply de Morgan's law
twice to the left-hand side of this equality, we have:
[M1 ⊕ M3 ⊕ M4 ⊕ M5 ⊕ M6]c = [M1c ∩ M3c ∩ M4c ∩ M5c ∩ M6c]
= [(AcBcC)c ∩ (AcBC)c ∩ (ABcCc)c ∩ (ABcC)c ∩ (ABCc)c]
= (A ∪ B ∪ Cc) ∩ (A ∪ Bc ∪ Cc) ∩ (Ac ∪ B ∪ C) ∩ (Ac ∪ B ∪ Cc) ∩ (Ac ∪ Bc ∪ C).
This last expression is the conjunctive maxterm expansion of f(A, B, C) = (AB ∪ Cc)(Ac ∪ C). Call
this maxterm expansion EMAX and call the minterm expansion EMIN. From the developments
above, it's clear that EMAX = EMIN; they are just different, but formally equivalent, ways of
expressing a Boolean function in canonical or standard form. So, in this particular example:
ABC ⊕ AcBCc ⊕ AcBcCc = (A ∪ B ∪ Cc) ∩ (A ∪ Bc ∪ Cc) ∩ (Ac ∪ B ∪ C) ∩ (Ac ∪ B ∪ Cc) ∩
(Ac ∪ Bc ∪ C).
In his work on Boolean algebra and circuits, Gregg provides a tabular method for
generating both minterm and maxterm expansions of a given Boolean function. As far as minterm
expansions are concerned, I believe the method I have employed is easier. I also believe the
method I have used for maxterm expansions is easier. However, I will provide an example of
Gregg's tabular maxterm expansion, since I will use this method in an illustration of some issues
associated with Stuart Kauffman's interest in Boolean functions and phase transitions. For a start,
we can think of a table of binary designators, such as the two shown earlier, as truth tables.
Each row in these tables records answers to the questions: Do we have an A, a B, a C? In Row
M0, for Ac, Bc, Cc, the answers are: No, No, and No, which we indicate by 0, 0, 0. For Row M5, for
A, Bc, C, we have the answers Yes, No, Yes, which we indicate by 1, 0, 1. In short 0 means no
and 1 means yes in any of these rows. Here, first, is Gregg's entire tabular analysis, which I will
explain step by step. In this example I will continue to use the function
f(A, B, C) = (AB ∪ Cc)(Ac ∪ C).
Binary Designators   (1)           (2)          (3)                    (4)
A  B  C              (AB ∪ Cc)     (Ac ∪ C)     (AB ∪ Cc)(Ac ∪ C)      Maxterm
0  0  0              1             1            1                      ---
0  0  1              0             1            0                      (A ∪ B ∪ Cc)
0  1  0              1             1            1                      ---
0  1  1              0             1            0                      (A ∪ Bc ∪ Cc)
1  0  0              1             0            0                      (Ac ∪ B ∪ C)
1  0  1              0             1            0                      (Ac ∪ B ∪ Cc)
1  1  0              1             0            0                      (Ac ∪ Bc ∪ C)
1  1  1              1             1            1                      ---
After listing the binary designators, or truth table, the first step in Gregg's method is to
break up f(A, B, C) into its major parenthesized elements: (AB ∪ Cc) in Column 1 and (Ac ∪ C) in
Column 2. Then, going through the binary designators or truth table, row by row, we ask whether
the combination of yes (1) and no (0) indications is consistent with the terms that head Columns 1
and 2. For example, consider Row 0 and the truth values [0, 0, 0]. This row is consistent with
(AB ∪ Cc) since we have Cc in this row. It is also consistent with (Ac ∪ C) in Column 2 since we
have Ac in this row. So, for Row 0, we record a 1 under the terms shown in Columns 1 and 2
indicating that Row 0 is consistent with both of the terms shown in Columns 1 and 2. As another
example, consider Row 2 whose truth values are [0, 1, 0]. This row of truth values for A, B, and C
is consistent with the terms in both Columns 1 and 2. We have Cc for the term in Column 1 and Ac
for the term in Column 2. We make the same truth determinations for each row of the table of
binary designators or truth values for A, B, and C. Columns 1 and 2 show the results.
The next step is to consider the intersection of the two parenthesized terms in Columns 1
and 2; this gives us our Boolean function being analyzed: f(A, B, C) = (AB ∪ Cc)(Ac ∪ C). This
entire function is shown in Column 3. The truth value for this entire function will be 1 (yes) if and
only if its elements in Columns 1 and 2 are both true (i.e. both take the value 1). Observe
that this entire function takes the value 1 only for Rows 0, 2, and 7. [Recall that f(A, B, C) in this
case has the minterm expansion: M0 ⊕ M2 ⊕ M7].
The third step focuses on those instances in which, for f(A, B, C) in Column 3, the truth
value is zero. In this final step we take the union of A, B, and C in each case and then
complement any term in these expressions that takes a 1 in its corresponding binary designator.
For example, consider Row 1 and its truth values [0, 0, 1]. We form the maxterm for this row by
adding a complement to C in this disjunction since C takes the value 1 in the binary designator.
Thus, for Row 1 we have the maxterm (A ∪ B ∪ Cc) shown in Column 4. As another example, for
Row 6 and its binary designator [1, 1, 0], the maxterm in Column 4 is: (Ac ∪ Bc ∪ C). If you
compare the maxterms in Column 4 generated by this tabular method, you will see that they are
the same as those I generated using de Morgan's laws twice over starting with the minterm
expansion for f(A, B, C) = (AB ∪ Cc)(Ac ∪ C).
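Gregg's tabular procedure is equally easy to mechanize. Here is a sketch that prints the columns of the table above (my own rendering of the procedure, not Gregg's code):

```python
from itertools import product

# f(A, B, C) = (AB ∪ Cc)(Ac ∪ C), split into its two parenthesized factors
col1 = lambda A, B, C: (A and B) or not C      # (AB ∪ Cc)
col2 = lambda A, B, C: (not A) or C            # (Ac ∪ C)

for A, B, C in product((0, 1), repeat=3):
    v1, v2 = int(bool(col1(A, B, C))), int(bool(col2(A, B, C)))
    v3 = v1 & v2                               # Column 3: the whole function
    events = [n + ("c" if v else "") for n, v in zip("ABC", (A, B, C))]
    maxterm = "(" + " ∪ ".join(events) + ")" if v3 == 0 else "---"
    print(A, B, C, " ", v1, v2, v3, " ", maxterm)
```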
In some instances, this tabular method might be quicker than the de Morgan laws
method. The truth is that neither method is very speedy when we have Boolean functions to
consider in which the number n of events in their generating class is very large. In some
instances we will get lucky and be able to observe by inspection of a minterm map which
minterms appear in an expansion of some Boolean function. When this does not happen,
however, we know that there are two formally equivalent ways of expressing a Boolean function
in canonical form, as I have just illustrated. As I noted above, and will illustrate further below, the
virtue of minterm expansions is that they provide us with an account of all specific and unique
instances of conjunctive combinations of the events in some generating class of events that
occurs in a Boolean function of interest. The first two applications of Boolean functions I will now
mention make use of minterm expansions.
3.0 SOME APPLICATIONS OF BOOLEAN FUNCTIONS
Here is a collection of thoughts about how Boolean functions and their canonical forms
may be usefully employed in three related areas of ongoing research. All three of these areas
involve processes having great complexity in which efforts are being made to discover or
generate new ideas or to invent new engineering designs. As I noted earlier, I believe there to be
common elements of these activities as they involve discovery and invention.
3.1 Inventor 2000/2001
In 2001 I tried my best to generate some ideas I hoped were useful in work with Tom
Arciszewski, Ken De Jong, and Tim Sauer on Inventor 2000/2001. In another document [Schum,
2001] I have mentioned some thoughts involving Boolean functions and their minterm expansions
that may serve to stimulate the process of inquiry regarding the evolutionary mechanism
according to which Inventor 2000/2001 generates new wind-bracing designs for tall buildings. The
computational engine in this system makes use of the evolutionary processes of mutations and
recombinations [e.g. crossovers], both of which can be construed as search mechanisms
[Kauffman, 2000, pp 16-20]. This evolutionary process also involves selection since at each new
step of the evolutionary process, only the fittest designs are selected and allowed to "mate" to
produce new designs at the next iteration. In early studies using Inventor 2000, the fitness
criterion was very simple and involved only one measure, namely the physical weight of the
design. Multivariate and, presumably, nonlinear fitness functions are being contemplated.
One rather obvious element of the complexity of the evolutionary process in Inventor
2000/2001 concerns the size of the design space to be searched. All engineering designs have
attributes, features, or characteristics; wind-bracing designs have many such attributes. The wind
bracing designs of concern in our early studies each had 220 attributes. But any design attribute
has a number of possible states or levels. In these early studies, 108 design attributes had 7
possible states, 108 had 4 possible states, and 4 had 2 possible states. This makes the total
number of different designs in this space to be T = [7^108][4^108][2^4] ≈ 29.76(10)^156, a preposterously
large number. Inventor 2000 incorporated a feasibility filter that automatically eliminates any
combination of attribute settings that would produce an infeasible or foolish design. Even if only
one in every million designs were accepted by this filter as feasible/sensible, we would still have
T* ≈ 29.76(10)^150 possible designs to search through in hopes of finding the fittest designs. If a
computer could generate (10)^6 new designs every second [one every microsecond], it would take
this system 9.44(10)^143 years to generate all the possible designs in this space.*
* At (10)^6 per second, this makes 6(10)^7 per minute, 3.6(10)^9 per hour, 8.64(10)^10 per day, and
3.154(10)^13 per year. But, since there are about 29.76(10)^156 possible designs, generating all of
them would take about 29.76(10)^156/3.154(10)^13 = 9.44(10)^143 years.
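The arithmetic behind these numbers is easy to reproduce; a quick Python check (exact integer arithmetic; small rounding differences from the rounded figures quoted above are to be expected):

```python
# Design-space size: 108 seven-state attributes, 108 four-state
# attributes, and 4 binary attributes.
T = 7**108 * 4**108 * 2**4
print(f"T ≈ {T:.3e} designs")                     # on the order of 10^157

# Years to enumerate the whole space at 10^6 designs per second,
# i.e. about 3.154(10)^13 designs per year, as in the footnote.
designs_per_year = 10**6 * 60 * 60 * 24 * 365
print(f"≈ {T / designs_per_year:.3e} years to enumerate them all")
```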
Looking through everything in the hope of finding something does not make sense, even
when we have possibility spaces not nearly as preposterously large as the one in our studies
using Inventor 2000/2001. This system makes use of search processes that mimic evolutionary
mechanisms [mutations, crossovers, and selections] that one might say are tried and true, since
nature has apparently used such mechanisms to produce an enormous diversity of species,
including homo sapiens sapiens, many of which have a degree of fitness that has allowed them to
survive for very long periods of time in the face of many environmental constraints. The issue of
the fitness of engineering designs raises even more issues of complexity. Clearly, the fitness or
suitability of an engineering design is a multiattribute characteristic. At this point it seems an open
question regarding just how many attributes ought to be considered in evaluating the fitness of
designs for wind-bracing or other engineering systems. Just identifying these individual fitness
attributes is not enough; we need to assess their relative importance and, most importantly, to
specify how they might combine in determining overall fitness. In most cases we can easily
suppose that these fitness attributes interact in very subtle ways. This brings more combinatorics
to mind since, if we had some number k of fitness attributes, there are 2^k - (k + 1) possible
interaction patterns to consider. One point here regarding the existence of complex interaction
patterns is that overall fitness functions will almost certainly be nonlinear in nature. So, Inventor
2000/2001 has complexity all over the place.
If we construe the mechanisms of Inventor 2000/2001 as being search processes alone
we realize that a major element of the discovery of new and fitter designs lacks one essential
ingredient, namely inquiry, the asking of questions. It's a fair guess that at no point in the last
several billion years on earth did nature stop the evolutionary process to see how well it was
proceeding and to ask how it might be improved. But we certainly have this capability. Ken De
Jong has wisely proposed that the evolutionary computational mechanisms in Inventor 2001 be
made “tunable” in the sense that we are allowed to adjust these mechanisms at various points in
order for them to operate more effectively and efficiently. Effective here presumably means
assured evolutionary convergence toward maximal fitness regions in the preposterously large
design space. Efficiency presumably means the speed at which such convergence might take
place. In other words, how can we converge to these maximal fitness regions, wherever they may
be, in the smallest number of evolutionary steps or stages? Such capabilities would certainly
enhance the applicability of Inventor 2001, and its successors, in any area of engineering or in
other contexts in which design improvements are continually being sought. My thoughts now turn
to the process of inquiry. In order to decide how best to make Inventor 2001 tunable we have to
begin by asking some questions, knowing that not all of these questions will be productive and
also knowing that some of these questions may seem quite impertinent.
One major element of Carl Hunt’s work on the ABEM system is this system’s ability to
help the user to ask better or more strategically important questions during investigative activities,
such as the criminal investigations in which he is particularly interested. By strategically important
I mean that a question leads an investigator along a new productive line of inquiry. I shy away
from saying that a strategically important question is always the “right” question. The reason is
that we may have to ask a sequence of questions whose answers may eventually lead us to ask
one that is “right” in the sense that its answers allow us to generate a hypothesis that contains
some truth. When I began to think about how I could contribute to research on Inventor 2000 I
had virtually no idea about what questions I should be asking about this system. This was due in
part to the complexity of this system’s activities as well as the complexity of the evolutionary
process this system attempts to capture in computational terms. It is also true that I am neither a
structural engineer nor a computer scientist.
As anyone who has ever studied the process of discovery knows, thought experiments
and simulations are very useful. Indeed, they are frequently the only kind of studies one can
perform, given the difficulty of performing more conventional empirical studies of the suitability of
any methods alleged to enhance discovery-related activities. I had already discovered the
difficulty of doing conventional empirical evaluations in my work on the design of computer-based
systems to enhance the process of marshaling thoughts and masses of evidence in investigations
in many contexts [e.g. Schum, 1999]. In my view, Inventor 2000/2001 might just as well be called
Discoverer 2000/2001, because it is attempting to bring to light the fittest designs [there may be
many of them] whose attribute combinations already exist among the T ≈ 29.76(10)^156 designs
that are possible. As I thought about how to begin to generate useful questions and thought
experiments or simulations, it seemed obvious that I would need to consider much simpler
situations. First notice above that only four of the 220 wind-bracing design attributes
are binary in nature. Suppose all of these design attributes were binary. In this case there would
then be 2^220 ≈ 1.685(10)^66 possible designs [instead of the T ≈ 29.76(10)^156 designs that actually exist
at present]. Not much real simplification here! Another possibility of course is to imagine that
there are fewer design attributes, all of which are binary in nature. This is where I began to think
about the possibility of Boolean functions, and their minterm expansions, as useful devices for
simulating the evolutionary processes captured by Inventor 2000. As we all know, simulations are
useful to the extent that they capture faithfully the most critical elements of the phenomena and
activities being simulated. This issue is termed: simulation fidelity. It seemed that I could capture
the essential evolutionary processes involving mutations, crossovers, and selection of the fittest
designs to use as parents in the generation of possibly even fitter designs. Here are some
connections between Boolean functions, minterms, and these evolutionary processes.
3.1.1 Designs as Chromosomes or Minterms
To begin, suppose that instead of there being 220 wind-bracing design attributes there
are just six such attributes, each of which has only binary states. In this highly simplified situation,
the generating class A = {A, B, C, D, E, F}, where {A, Ac}, {B, Bc}, {C, Cc}, {D, Dc}, {E, Ec}, and {F,
Fc}, are the binary states of each individual event [attribute] class. In this case there are 2^6 = 64
possible minterms, each of which constitutes a unique design attribute combination. For example,
one design is (ABCDcEFc), whose binary designator, or bit string, is (111010) and whose minterm
number is M58. In words, we can say that this design has an A, B, C, and E, but no D and no F.
The following minterm map shows the numbering for the 64 minterms in this special case.
                        Ac                                A
                  Bc          B                     Bc          B
               Cc    C     Cc    C               Cc    C     Cc    C
Dc   Ec   Fc    0     8    16    24              32    40    48    56
          F     1     9    17    25              33    41    49    57
     E    Fc    2    10    18    26              34    42    50    58
          F     3    11    19    27              35    43    51    59
D    Ec   Fc    4    12    20    28              36    44    52    60
          F     5    13    21    29              37    45    53    61
     E    Fc    6    14    22    30              38    46    54    62
          F     7    15    23    31              39    47    55    63

Figure 1
In the evolutionary computation literature, the minterm map just shown would be called a
search or representation space [X], whose individual members [minterms] are x ∈ X. The
members x ∈ X are variously called: chromosomes, genotypes, genomes, or individuals
[Dumitrescu, et al, 2000, 12]. As I will illustrate later, one advantage of a minterm representation
for a design is that it identifies and distinguishes between genes on a chromosome [design] in a
way that a bit string cannot do. One related matter concerns the formulation of Boolean functions.
Consider again M58 = (ABCDcEFc), whose bit string is (111010). We cannot form any Boolean
function using just a bit string; such functions require identification of the events whose possible
binary states are indicated by 0 or 1. I do understand however that computation involves bit
strings.
As I described earlier, a given Boolean function can be decomposed into a unique subset
of either minterms or maxterms. There is an interesting relation between minterms and a term
employed in evolutionary computation; the term is schema. Consideration of this term lets me
introduce some additional ideas related to possible uses of Boolean functions in our work. First,
suppose all chromosomes [or designs] of concern have n genes [design attributes] that each
have binary states. We know that there are 2^n possible chromosomes, genotypes, or genomes; in
other words X = the complete set of these chromosomes. As just noted, X can be interpreted as a
minterm map. But, since each chromosome is assumed to have binary states, we can also
represent X, the chromosome space, as an n-dimensional hypercube: X = {0,1}^n. A schema of the
chromosome space X is a string of length n involving the three possible symbols 0, 1, and *,
where the asterisk symbol represents what is termed a "wild card" or a "do not care" entry [i.e. it
could be either 0 or 1, we don't care which]. Consider the schema S = 1 * * 0. In bit string terms,
there are four chromosomes that are represented by this schema S; they are: 1000; 1010; 1100;
and 1110. In event form, using minterms for this four-variable case, the associated minterms are:
ABcCcDc; ABcCDc; ABCcDc; and ABCDc. In words, what this schema says is that we have a
selection of all possible four-variable minterms having an A but not a D. But what was just said is
a simple Boolean function f(A, B, C, D) = ADc. This schema is just another way of representing all
the chromosomes, minterms, or designs that satisfy f(A, B, C, D). The minterm expansion
theorem simply allows us to identify all the designs or elements in a schema. A bit later I will
illustrate how a variety of useful indices including fitness criteria and feasibility filters can be
expressed as Boolean functions, any of which can be decomposed into a schema consisting of a
unique subset of either minterms or maxterms. Simply stated, minterm expansions of Boolean
functions act to identify all the possible instances of any schema that may be of interest.
3.1.2 Selection Mechanisms
The basic mechanisms employed in evolutionary computation involve selection, mutation,
and some form of recombination of genetic elements. Such processes take place over time (t)
and have some starting point t0. Let U(t0) represent the initial total universe or population of
chromosomes/minterms available before any initial selection is made; i.e. U(t0) = X. Remember
that X here represents the entire collection of 2^n chromosomes/minterms of length n. In work on
Inventor 2000, U(t0) = T ≈ 29.76(10)^156 designs that are possible. A digression is necessary at this
point because of how chromosomes/minterms are described in the evolutionary computation
literature I have read. It is asserted that the genetic algorithms involved in evolutionary
computation suppose a population of independent individuals or chromosomes [e.g. Dumitrescu,
et al, 2000, 26]. I believe there is a mistake in using this term independence with reference to
individuals, designs, chromosomes, or minterms. I have gone to considerable lengths to show
that chromosomes, designs, or minterms in U(t0) = X are mutually exclusive. This means that they
cannot be independent [in a probabilistic sense]. The terms independence and mutual exclusivity
are two of the most frequently misused terms in all of probability theory. Many persons use them
interchangeably, when they are not interchangeable; if we have one we cannot have the other. The reason follows
from elementary probability. Suppose two events E and F, each with non-zero probability. If these
events are independent, then P(E|F) = P(E), which entails that P(EF) = P(E)P(F) > 0, since P(E) >
0 and P(F) > 0. But now suppose that the same events E and F, each having non-zero
probability, are mutually exclusive, in which case P(EF) = 0. They cannot be independent since
independence would require P(EF) = P(E)P(F) = 0, which is impossible since P(E) > 0 and
P(F) > 0. What all this says is that any design
can only have one unique chromosome or minterm.
Now, there are three instances in which independence does arise in the consideration of
chromosomes, designs, or minterms. The concept of Boolean functions helps us to identify the
first form of independence. In decomposing a Boolean function into minterms or maxterms, no
specification is required regarding whether the events in the Boolean function are either
independent or mutually exclusive. For example, suppose f(A, B, C) = ABc. Decomposed into
minterms, f(A, B, C) = ABcC ⊕ ABcCc, regardless of how we might believe events A, B, C [and
their complements] to be related. Now here's the point: calculating a probability of any minterm
does require knowledge about the relationship between events in a minterm. For example,
suppose we believe that events A, B, and C are completely independent, then P[f(A, B, C)] =
P(ABc) = P(A)P(Bc)P(C) + P(A)P(Bc)P(Cc). Stated in genetic terms, two different chromosomes
are by nature mutually exclusive. However, their genetic elements may or may not be
independent; perhaps the existence of an allele of one gene makes the existence of an allele in
another gene more probable or less probable. A second independence at issue is of great
importance in studying the fitness of chromosomes/designs/ minterms. Genes in a given
chromosome may interact, or be nonindependent, in their influence on the fitness of
chromosomes. Here's where nonlinearity and great complexity enter the picture. Trying to
discover the nature of these interactions among genes that influence fitness is perhaps the
greatest challenge facing discovery in the evolutionary process. A final form of independence
concerns the selection process itself, to which we now return. The conditions of sampling or
selection of chromosomes govern this form of independence. If selection is made with
replacement, then the probability of generating [selecting] a chromosome does not change from
trial to trial, in which case we say that the trials are independent. But, the actual selection of
chromosomes in evolutionary computation proceeds along other lines, as I will now describe.
Returning to U(t0) = X, the initial universe or population of individual chromosomes,
suppose a selection is made from this initial universe to begin the genetic processes of mutation
and/or mating [for purposes of recombination or the transfer of genetic elements among
chromosomes]. Let this initial selection of individual chromosomes be represented by M(t0),
where M(t0) is a subset of U(t0) = X. We might refer to M(t0) as the initial conditions of the
evolutionary process since it is from this initial selection of chromosomes that the evolutionary
process gets started. In many examples of evolutionary computation, random sampling from
U(t0), assuming a uniform probability distribution, is used to select members of M(t0). But there
are other strategies, one of which is called partially enumerative initiation [Dumitrescu, et al,
2000, 31 - 32]. This strategy makes use of the schema, discussed above, that can be
represented as Boolean functions. Using some fitness criteria, suppose that various interesting
schema of a specified size are identified. This form of sampling involves assuring that at least one
instance of each identified schema is included in the sample selected. Another strategy is called
doping. Using this strategy, M(t0) might be "doped" by the insertion of some very fit
chromosomes. This assumes, of course, that fitness is readily recognizable or that it might be
easily guessed or inferred. Some knowledge of fitness formed the basis for the feasibility filter in
Inventor 2000 that I mentioned earlier.
There's another very interesting matter concerning selection that I have spent some time
trying to analyze formally; it concerns variability in the gene pool of selected samples such as
M(t0) as well as in other samples selected at later stages in the evolutionary process. As we
know, genetic variability among parent chromosomes being mated acts to promote diversity in the
genetic characteristics of offspring. Ensuring genetic diversity at each stage of the evolutionary
process seems one major objective as the process lurches toward regions of greater fitness.
Absent such variability, the process might wallow in some non-ideal region of a fitness landscape.
This is one reason why incest is not especially adaptive. Mutations at various stages of the
evolutionary process also help to prevent wallowing at a certain fitness level.
Here is a strategy in which an attempt is made to promote genetic variability in the
subsequent offspring of some collection of parent chromosomes. We first need to determine how
many different chromosome pairs are possible in some universe or population from which some
selection is to be made. The following arguments apply whether the population of concern is U(t0)
or some population that arises at a later stage in the evolutionary process. The major assumption
in what follows is that the genetic elements of any individual chromosome exist in binary states,
such as those I have considered in my discussion of minterms. First, we already know that, when
chromosomes have length n, there are 2^n possible different chromosomes/minterms. There are 4^n
possible pairs of chromosomes in some U(t0) = X. This arises from the fact that there are four
possible pairings of the binary genetic elements of any chromosome. For example, if {A, Ac} form
the binary states of gene A, then we can pair these states in two chromosomes in the following
four ways: (A, A); (A, Ac); (Ac, A); and (Ac, Ac). Now, analyses of genetic variability or diversity
among pairs of chromosomes requires that we have a way of determining how many
chromosome pairs, among the 4n that are possible, have exactly k genetic elements in common,
where 0 ≤ k ≤ n. As I have shown elsewhere [Arciszewski, Sauer, & Schum, 2002, 55-56], letting
Φ(k) be the number of pairs of chromosomes having exactly k genetic elements in common,

Φ(k) = C(n, k)2^k 2^(n-k) = C(n, k)2^n.

[I use the expression C(n, k) to represent the process of selecting k elements from n
distinguishable elements with replacements not allowed. As we know, C(n, k) = n!/[k!(n-k)!].]
Across values of k, the distribution of Φ(k) is symmetrical, as one would expect given the binary
nature of the genetic elements involved in these combinatorics.
Now, what we need to show genetic variability in chromosome pairs is not the number of
elements they have in common, but the number of elements that are different. This turns out to
be exactly what the Hamming distance shows us. For any two bit strings of equal length, the
Hamming distance shows the number of different elements in the strings. For example, suppose the two
bit strings (0010) and (0101). Their Hamming distance is 3 since the last three bits in each string
are different. Chromosome pairs whose Hamming distance is largest will contribute most to
genetic variability.
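Both quantities are straightforward to compute. A minimal Python sketch that reproduces the Φ(k) counts of Table 1 below and the Hamming-distance example just given:

```python
from math import comb

def phi(n, k):
    """Ordered chromosome pairs of length n with exactly k elements in common:
    C(n, k) * 2^k * 2^(n-k) = C(n, k) * 2^n."""
    return comb(n, k) * 2**n

def hamming(s, t):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(a != b for a, b in zip(s, t))

assert hamming("0010", "0101") == 3
assert phi(20, 10) == 193_730_707_456               # ≈ 1.93731(10)^11, the modal entry
assert sum(phi(20, k) for k in range(21)) == 4**20  # all pairs accounted for
```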
I cannot easily illustrate the initial or later selection method I have in mind using short
chromosomes, or simple designs, such as those shown in the minterm map of Figure 1 above.
Suppose instead a situation in which chromosomes/minterms/designs have twenty genes or
attributes, each of which exist in binary states. This represents an initial universe U(t0) = X of
2^20 = 1,048,576 possible chromosomes/designs. In this situation there are 4^20 ≈ 1.099512(10)^12
possible different chromosome/design pairs. Table 1 below shows how many of these
chromosome pairs have Hamming distances between zero and twenty.
The situation shown in Table 1, though analytically tractable, still results in astronomical
numbers and begins to resemble the complexity of the actual design situation faced in Inventor
2000. Because of the symmetry of this unimodal distribution, the mean, median, and mode
coincide. If a single chromosome pair were picked at random from this distribution, the most likely
consequence would be that this pair has Hamming distance = 10. This table is helpful in
illustrating various strategies that may be employed in determining an initial M(t0) or a similar
mutation/mating population at any later stage in evolutionary computation. The purpose of such
strategies, again, is to ensure that the gene pool has high variability at each stage of evolutionary
computation. In the case shown, involving 20 binary genes [design attributes], we see that there
are over a million chromosome pairs that have Hamming distance = 20. However, this represents
just a vanishingly small proportion of the total number of pairs. If the individuals in M(t0) were
selected from among these 1,048,576 pairs, this would ensure that the initial conditions of the
evolutionary process favored the most variable gene pool possible in the case in which binary
chromosomes are of length twenty. Less stringent strategies are possible of course. Observe in
Table 1 that just six tenths of one percent of all chromosome pairs have Hamming distance at
least 16, but this is still a very large number [6,496,976,896 chromosome pairs]. Another
consequence of the distribution in Table 1 is that, if the selection of pairs for M(t0) were done at
random, about 73.6% of these pairs would have a Hamming distance between 8 and 12.
Knowledge of the proportion of common genetic elements in chromosome pairs, such as Table 1
below provides, is helpful in decisions about what initial conditions might be constructed for any
run of evolutionary computations.
Common      Hamming      Φ(k)                 Proportion of     Cumulative
Elements    Distance                          Total Pairs       Proportion
0           20           1,048,576            0.000+            0.000+
1           19           20,971,520           0.000+            0.000+
2           18           199,229,440          0.000+            0.000+
3           17           1,195,376,640        0.001             0.001
4           16           5,080,350,720        0.005             0.006
5           15           1.62571(10)^10       0.015             0.021
6           14           4.06428(10)^10       0.037             0.058
7           13           8.12856(10)^10       0.074             0.132
8           12           1.32089(10)^11       0.120             0.252
9           11           1.76118(10)^11       0.160             0.412
10          10           1.93731(10)^11       0.176             0.588
11          9            1.76118(10)^11       0.160             0.748
12          8            1.32089(10)^11       0.120             0.868
13          7            8.12856(10)^10       0.074             0.942
14          6            4.06428(10)^10       0.037             0.979
15          5            1.62571(10)^10       0.015             0.994
16          4            5,080,350,720        0.005             0.999
17          3            1,195,376,640        0.001             0.999+
18          2            199,229,440          0.000+            0.999+
19          1            20,971,520           0.000+            0.999+
20          0            1,048,576            0.000+            1.000
Totals                   1.099512(10)^12      1.000             1.000

Table 1
I now return to the selection process as it unfolds over time. I hope Figure 2 will help
clarify various choices we have in allowing evolutionary processes to unfold.
As I noted above, in my present example involving chromosomes having twenty binary
genetic elements, there are U(t0) = 2^20 = 1,048,576 different or unique chromosomes. From this
number we select M(t0) chromosomes, according to a certain strategy, to represent the initial
mating/mutation population that will result in our first generation of "children" chromosomes that
are obtained from the recombination and/or mutation of the chromosomes in this selection. As
Figure 2 illustrates, let C(t1) represent the first generation of "children" produced by this
mating/mutation process. I understand that it might not be entirely appropriate to call the new
individuals in C(t1) "children" if the first evolutionary process only involved mutations. We could of
course call them "single-parent children". In any case, here is where fitness criteria enter the
picture, if they have not already done so in the initial selection of M(t0). Suppose some selection
of the "fittest" children in C(t1) is made to form some new mating population M(t1). In Figure 2 this
selection process is indicated by the bold arrow from C(t1) to M(t1). But in some instances of
evolutionary computation I have seen, individuals from the "parent" population M(t0) that
produced C(t1) might be included as well in the formation of the new mating population M(t1).
This I have indicated by the thin arrow [with a question mark] in Figure 2. The idea seems to be
that not all of the new children in the first generation are necessarily more fit than their parents;
some may be even less fit than their parents. So, particularly fit parents might be included in the
mating population that will produce a second generation [C(t2)].
There are several concepts associated with selection that I should mention here. The
first, termed the generation gap, refers to the percentage of a population that is to be replaced in
each generation. As an example, in Figure 2, at t1, we began with a collection of individuals M(t0),
those initially selected to form a mating/mutation pool. As a result of this mating/mutation process,
we now have our first generation containing new chromosomes C(t1). The generation gap here
refers to how many of these individuals will be selected for replacement to form the second
generation. Another term is selection pressure, which refers to the degree to which highly fit
individuals in one generation are allowed to produce offspring in the next generation. Consider
again C(t1) in Figure 2. Various strategies are possible in selecting from among these individuals
to form the mating population M(t1) whose offspring will form the new generation C(t2). If selection
pressure is low, then each individual in C(t1) has a reasonable chance to be included in M(t1).
High values of selection pressure favor the fittest individuals in each generation. Two correlated
consequences of selection are termed genetic drift and premature convergence. Genetic drift is
usually associated with a loss of genetic diversity and causes an evolving population to cluster
around a particular fitness value, even when there are other higher fitness regions. Premature
convergence is a way of saying that evolving populations converge to only locally optimal regions
in fitness space. This can also happen when a few comparatively fit individuals are allowed to
dominate by being selected again and again in successive mating/mutation processes. This acts
to decrease genetic diversity. In general, there seem to be two major objectives to be served in
the selection process. The first is to ensure high reproductive chances for the fittest individuals.
The second is to preserve maximum genetic diversity in order to explore as much of the search
space as is possible. Different evolution strategies involve various tradeoffs between these two
objectives.
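To make selection pressure concrete, here is a sketch of tournament selection, one common strategy [my own illustrative choice; it is not singled out in the sources cited above]. The tournament size s acts as the pressure knob: the larger the tournament, the harder it is for an unfit individual to be chosen.

    import random

    def tournament_select(population, fitness, s, m):
        """Draw m parents; tournament size s sets the selection pressure."""
        return [max(random.sample(population, s), key=fitness)
                for _ in range(m)]

    # Toy usage: 6-gene chromosomes graded by an invented fitness [count of 1s].
    pop = [format(i, "06b") for i in range(64)]
    parents = tournament_select(pop, fitness=lambda c: c.count("1"), s=4, m=10)

With s = 1 the draw is effectively random [low pressure]; with large s the fittest individuals dominate the mating pool, inviting exactly the genetic drift and premature convergence just described.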
3.1.3 Search Involving Recombinations and Mutations
The next issue to be examined concerns the various forms of mating that have been
identified [Dumitrescu, et al, 2000, 120 - 121]; they identify seven different mating paradigms. I
can easily illustrate these methods using the minterm map in Figure 1 on page 14 for
binary chromosomes/minterms of length six. The first thing I will note is that there are 4^6 = 4096
possible chromosome pairs in this six-variable case. I will also use the notation for collections of
chromosomes I introduced in Figure 2 above on page 18. First, suppose that U(t0) = X is the
entire minterm map shown in Figure 1 on page 14. From this collection of 64 possible
chromosomes/minterms we are to select some number M(t0) ≤ 64 for mating operations and
possibly mutations. The first, and apparently the most popular, method involves random mating in
which mates [pairs of chromosomes/minterms] are selected at random from M(t0). A second
method involves inbreeding in which similar parents are intentionally mated. For example, we
might choose to mate minterms M50 and M58 because their Hamming distance is just 1 [M50 has
Cc and M58 has C]. The next mating strategy is called line breeding where one unique very highly
fit individual is bred with other members of M(t0) and the offspring are selected as parents for the
next generation. Suppose M44 is the lucky chromosome that gets chosen to mate with each one
of, say, 20 other minterms that are randomly chosen to form M(t0). There is, of course, an
important question gone begging in this strategy; it assumes that we have a well-defined fitness
function and could measure the fitness of every chromosome/minterm on the map. If we knew
what was the fittest one of the lot, we would have no search problem on our hands.
A fourth method is called out-breeding in which very different individual
chromosomes/minterms are mated. Extreme examples in the minterm map on page 14 are M0
and M63, which are among the C(6, 0)2^6 = 64 chromosome pairs whose Hamming distance is the
maximum 6. Having selected this chromosome pair, we might then select any of the other
C(6, 1)2^6 = 384 chromosome/minterm pairs that have only one genetic element in common. The
fifth method is called self-fertilization in which an individual is combined with itself. I may be
missing something here, but I fail to see what is accomplished by this method. Crossovers would
not, by themselves, produce new children; the parent would continue to clone him/herself. Only if
mutations were added would any new children eventually result. A sixth method, called cloning,
occurs when an individual chromosome is never replaced and is added, without modification, to
every new generation. A seventh method is called positive assortive mating and occurs when
similar individuals are mated. In the minterm map on page 14, exactly C(6, 6)2 6 = 64
chromosome pairs will have all six genetic elements in common and exactly C(6, 5)26 = 384 will
have exactly five genetic elements in common. The seventh method, called negative assortive
mating, sounds very much like out-breeding where dissimilar individuals are mated. An eighth
method might be added to the list that involves incestual matings of various sorts; parents mating
with children, children of the same parents mating together, and even children mating with their
grandparents. In simulations I have performed, not all of these incestual matings simply produce
copies of already generated chromosomes. However, such matings do severely restrict genetic
diversity.
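Several of these paradigms amount to filtering candidate pairs by genetic similarity. Here is an illustrative sketch of inbreeding and out-breeding implemented as Hamming-distance filters [the thresholds are my own]:

    import itertools, random

    def hamming(x, y):
        return sum(a != b for a, b in zip(x, y))

    def pairs_by_distance(population, low, high):
        """Candidate mates whose Hamming distance lies in [low, high]."""
        return [(x, y) for x, y in itertools.combinations(population, 2)
                if low <= hamming(x, y) <= high]

    pop = ["".join(random.choice("01") for _ in range(6)) for _ in range(20)]
    inbreeding_pairs = pairs_by_distance(pop, 1, 1)   # very similar parents
    outbreeding_pairs = pairs_by_distance(pop, 5, 6)  # very different parents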
Following are various strategies that have been considered for recombination of
chromosome elements, not all of which involve crossovers of the genetic elements of two "parent"
chromosomes; and not all of which are observed in nature. I begin with recombinations alone and
then later consider them in combination with mutation operations. Such recombination and
mutation operations are designed to enhance the exploration of more extensive regions of a
fitness landscape. Here first is a selection of crossover strategies that have been employed; there
are others as well.
Single Point Crossovers: Suppose two chromosomes, each having n genes. Crossovers, or the
exchange of genetic elements between these two chromosomes, are accomplished by selecting a
crossover point for this exchange. If chromosomes have n genes, then there are n - 1 single
crossover points. In most cases of evolutionary computation single crossover points are selected
at random. Here is a case involving two parent chromosomes P1 and P2 where each parent has
six binary genetic elements. Crossover point 2 has been selected and, after the exchange of
genetic elements, two children chromosomes C1 and C2 result:
P1 = A   Bc | C   D   Ec  Fc   = M44
P2 = Ac  B  | Cc  D   E   F    = M23
C1 = A   Bc | Cc  D   E   F    = M39
C2 = Ac  B  | C   D   Ec  Fc   = M28
Referring to the six-variable minterm map on page 14, we mated minterms M44 and M23 and, after
crossover, produced M39 and M28 as children. It is apparent that children different from their
parents will usually [not always] result from different crossover point settings. Such differences
will not happen when the genetic elements of the two parents are the same below a single
crossover point.
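In code, a single-point crossover on bit strings is a matter of swapping tails [an illustrative sketch; the crossover point counts the genes retained from each parent's head]:

    def single_point_crossover(p1, p2, point):
        """Exchange the tails of two equal-length chromosomes at one point."""
        return p1[:point] + p2[point:], p2[:point] + p1[point:]

    # The example just given: M44 x M23 with crossover point 2.
    print(single_point_crossover("101100", "010111", 2))
    # ('100111', '011100'), i.e. the children M39 and M28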
Two-Point Crossovers: In the case of two chromosomes each having n genetic elements,
there are C(n - 1, 2) possible settings of two crossover points. Following is an example when n = 6
and we have the same two parents [M44 and M23] as in the previous example.
P1 = A   Bc | C   D   E  | Fc" is wrong; corrected below.
As you observe, the children of these same two parents differ from those produced by the single-point crossover.
N-Point Crossovers: The number of possible crossover points is obviously limited by the
number of genes on any pair of chromosomes. One final example involves the same two parents
M44 and M23 with three crossover points as shown:
P1 = A  | Bc  C  | D   Ec | Fc   = M44
P2 = Ac | B   Cc | D   E  | F    = M23
C1 = A  | B   Cc | D   Ec | F    = M53
C2 = Ac | Bc  C  | D   E  | Fc   = M14
[The three crossover points fall after the first, third, and fifth genes.]
As you see, different children are produced here from those produced by either of the two other
crossover forms I have just illustrated.
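All of these variants are special cases of an N-point crossover in which parental segments alternate at each crossover point. A sketch [illustrative code of my own]:

    def n_point_crossover(p1, p2, points):
        """Alternate parental segments at each crossover point."""
        c1, c2, swap, prev = "", "", False, 0
        for point in list(points) + [len(p1)]:
            seg1, seg2 = p1[prev:point], p2[prev:point]
            c1 += seg2 if swap else seg1
            c2 += seg1 if swap else seg2
            swap, prev = not swap, point
        return c1, c2

    # The three-point example: M44 x M23 with points after genes 1, 3, and 5.
    print(n_point_crossover("101100", "010111", [1, 3, 5]))
    # ('110101', '001110'), i.e. the children M53 and M14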
A very interesting issue arises here concerning crossovers, schema and their associated
Boolean functions. The use of just a single crossover point can cause difficulties because it can
prevent us from generating new children that are associated with some given schema of interest.
Remember that schema can be expressed as Boolean functions for binary genetic settings.
Schema can represent such things as expected successful or fit genetic combinations. Consider
the following situation in which we have chromosome pairs having, say, eleven genes A through
K. Here are two schema that I will express in two ways:
S1 = (01*******11) = (AcB*******JK), and
S2 = (***101*****) = (***DEcF*****).
First consider S1 and the following Boolean function and its interpretation.
Let f1(A, B, C, D, E, F, G, H, I, J, K) = (AcBJK). This schema says: "Offspring will be fit to degree Y
if they have B, J, and K, but not A". Now let f2(A, B, C, D, E, F, G, H, I, J, K) = (DEcF). This
schema says: "Offspring will be fit to degree Y if they have D and F, but not E". We can, of
course, combine these two Boolean functions to read: "Offspring will be successful to degree Y if
they have either B, J, and K, but not A, or if they have D and F, but not E". The Boolean function
here will be f1,2 = (AcBJK) ∨ (DEcF). Minterms or bit strings associated with f1,2, as just described,
will have the following form:
S3 = (01*101***11) = (AcB*DEcF***JK).
As I mentioned earlier, the minterm expansion theorem assures us that we can identify all the 2^4
= 16 specific chromosomes [minterms] that will be associated with S3.
As discussed by Dumitrescu et al [2000, 109], if we mate two chromosomes represented
by S1 and S2, and if we adopt a single-point crossover, we will be unable to have, as offspring,
children that satisfy S3. But we will have such desirable offspring if we choose a two-point
crossover operation instead. It happens that single crossover points can act to prevent the
occurrence of offspring that are associated with complex fitness specifications. This is one reason
why nature often employs multiple crossover points, and so do many persons engaged in studies
employing evolutionary computation.
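Schema expansion of this kind is easy to mechanize. Here is an illustrative sketch that enumerates every chromosome consistent with a schema written using '0', '1', and '*':

    from itertools import product

    def expand_schema(schema):
        """All bit strings consistent with a schema such as '01*101***11'."""
        options = ["01" if ch == "*" else ch for ch in schema]
        return ["".join(bits) for bits in product(*options)]

    chromosomes = expand_schema("01*101***11")   # the schema S3 above
    print(len(chromosomes))                      # 2^4 = 16, as promised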
Adaptive [Punctuated] Crossovers: Crossover operations need not be fixed throughout
the evolutionary process; indeed these operations can be made to evolve themselves in light of
results obtained in preceding generations. One such method, adaptive [punctuated] crossover, is
described by Dumitrescu et al [2000, 110-111] as a method involving the self-adjustment of
crossovers as evolutionary computation trials proceed. If a selected crossover point produces a
good outcome it is retained; if not, it is allowed to die and new ones are introduced. Such a
strategy requires recording crossover points along with the genetic information in any chromosome.
Segmented Crossover: This is a variation of the N-Point crossover method in which the
number of crossover points is not held constant over trials. Different numbers of crossover points
are selected randomly as trials proceed.
Uniform Crossover: In this method there are no pre-defined crossover points. Instead, the
state of any gene in an offspring is selected at random, according to various probability
distributions, from the gene states of its parents.
I next consider the operation of mutation as an evolutionary operation. The effect of this
operation is generally to change the state of an individual genetic element in a chromosome. For
example, in a chromosome having gene state A, this state may be mutated to state Ac. It happens
of course that the states of more than one gene in a chromosome may be altered by mutation. In
evolutionary computation, as in the evolution of living organisms, mutation is a probabilistic [or
stochastic] rather than a deterministic process. A mutation probability, or mutation rate, pm is
defined. This probability refers to the likelihood that any single genetic element of any
chromosome will be altered by mutation. Suppose at time ti there are N chromosomes in some
search space, each having n binary genes. This means that there are nN total gene states in this
space or population at time ti. Thus, on average, there will be nNpm gene states in this population
that will undergo mutation at time ti. As expected in searches involving evolutionary computation,
mutation rates can be varied and many studies have been performed to find optimal values of pm
in particular situations. Usually, however, the mutation rate pm is set at very small values, typically in
the range [0.001, 0.01]. It happens that mutation rates can be tuned or varied as the evolutionary
search process proceeds. In other words, mutation rates need not be kept stationary as the
process moves along. Suppose, as this search process continues, there is convergence to
conditions of greater fitness on a fitness landscape. Mutations at this point might be disruptive.
Various strategies have been employed to overcome such difficulties [Dumitrescu, et al, 2000,
139 - 144]. Some involve schemes for making pm time-dependent, such that pm decreases over
time during trials. Other methods act to decrease pm as measured fitness of chromosomes in new
generations increases.
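Here is a sketch of the mutation operator together with one possible time-decaying schedule for pm [the half-life form is my own illustrative choice, not a recommendation from the sources cited]:

    import random

    def mutate(chromosome, pm):
        """Flip each binary gene independently with probability pm."""
        return "".join(b if random.random() > pm else "10"[int(b)]
                       for b in chromosome)

    def decayed_rate(pm0, t, half_life=50.0):
        """One way to let the mutation rate fall as the search converges."""
        return pm0 * 0.5 ** (t / half_life)

    # On average, N chromosomes of length n suffer about n*N*pm flips per step.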
If evolutionary computations relied only on random mutations, convergence to regions of
greater fitness might take a long time. Recombinations, in the form of crossovers, can speed up
the process. A major role of mutation is to help prevent the loss of genetic diversity and, with it,
premature convergence. A result is that larger regions of a fitness landscape can be
explored. However, Stuart Kauffman [2000, 16 - 20] has recently commented on the suitability of
search procedures involving mutations, recombinations, or both. It might seem that search
processes in the absence of metaphoric mating and recombination [i.e. mutation alone] are quite
useless. However, Kauffman argues that, as search strategies, recombinations are only useful on
smooth, highly correlated fitness landscapes where regions of greatest fitness all cluster together.
He further argues that, averaged across all possible fitness landscapes, one search procedure is
as good as another. This brings me to my final topic as far as Inventor 2000/2001 is concerned,
namely a discussion of how alternative fitness criteria may be described for binary search
processes.
3.1.4 Fitness Criteria, Fitness Functions and Boolean Functions
Under the assumption of binary genetic elements of individual
chromosomes/minterms/designs in some search space, Boolean functions and their
decompositions into minterms or maxterms provide a theoretical basis for capturing many
important elements of evolutionary processes. I cite as particularly elegant examples the
abundant use of such functions by Stuart Kauffman in his work on self-organization and complex
adaptive systems [Kauffman, 1993, 1995, 2000]. I will now argue that fitness criteria as well as
the feasibility filters in this system can be represented in terms of Boolean functions. Having a set
of fitness criteria is, of course, not the same as having a definite mathematical function
appropriate for grading the fitness of individual chromosomes/designs in some search space. But
being able to express fitness criteria is a step toward the development of explicit fitness functions.
Using Boolean functions we can identify specific combinations of genetic elements, and their
possible interactions, that seem to contribute to fitness [an example follows below]. The
decomposition of Boolean functions in identifying fitness criteria does present several problems,
most of which concern chromosome length and the incredible sizes of the search spaces that
result. I will mention some of these problems a bit later. I was about to lose interest in relating
Boolean functions and fitness criteria until I read about the work on schemata in evolutionary
computation as described by Dumitrescu et al.
In discussing schemata such as those in the examples I mentioned above, Dumitrescu et
al connect schemata and fitness in a somewhat curious way. In one place, following Holland
[1975], they relate schemata to building blocks having "acceptable solutions" which, when
combined, produce larger building blocks having "good solutions" [Dumitrescu et al, 2000, 33].
Another example involves their mating strategy called partially enumerative initiation that I
mentioned earlier. This strategy concerns matings involving at least one member of schemata
assumed to be "successful". In another place, while discussing N-point crossovers, they apply the
term "successful" to chromosomes that represent schemata [2000, 109]. In these situations, the
assumption is that there is some way of grading the "acceptability", "goodness", or "success" of
particular chromosomes as the evolutionary process proceeds. As I have illustrated above, any
schemata, as well as combinations of them, can be represented as Boolean functions. Minterm
decompositions of any schemata or combination of them, represented as a Boolean function, can
be decomposed to reveal the specific chromosomes that correspond to these schemata or
combinations of them. Having mentioned the "acceptability", "goodness", or "success" of
members of schemata, Dumitrescu et al then say: "No direct computation of the schemata fitness
is made by the genetic algorithm" [2000, 37]. As far as I can tell from what they have said, the
fuzzy fitness judgments just mentioned are based on knowledge of a problem domain or perhaps
just on guesses or hypotheses about fitness. In Inventor 2000, a single criterion, overall design
weight, was employed with complete recognition of the fact that other fitness criteria are
necessary. Here are some thoughts about fitness criteria, fitness functions, and
chromosomes/designs represented as minterms.
Consider again the six-variable minterm map in Figure 1 on page 14. Suppose there to
be a specific fitness function g that can be applied to any minterm Mi on this map, and further
suppose that we have values of g(Mi) for all 64 minterms. If we had such a function we could say
that we have described the fitness landscape for this complete collection of minterms, or search
space of individual chromosomes/possible designs. There are some troubles here related to the
interpretation of such a fitness landscape that concern the properties of minterms and the various
numerical ways in which we can identify them. First, on pages 5 - 7 above I described a binary
designator method for keeping track of minterms in an orderly way, and I mentioned how binary
designators are also called bit strings. Using the binary designator or bit string for any minterm we
can convert this binary designator into a decimal equivalent, and it is this decimal equivalent that
we use to identify a minterm and to place it on a minterm map. Thus, minterm ABCcDEFc has an
associated bit string 110110, whose decimal equivalent is the number 54; so we say that minterm
ABCcDEFc = M54. A major question we have to ask is: What do the decimal equivalents assigned
to each minterm tell us about this minterm? The obvious answer seems to be that the decimal
equivalents we assign to minterms have only nominal scale properties. That is, all these numbers
do is to identify the unique event [genetic element] combination that occurs in each minterm. Here
is what these minterm numbers do not tell us. By the way, the following conclusions apply to
minterm maps, or binary search spaces, of any size.
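The bookkeeping between a minterm's bit string and its decimal designator is an ordinary base-2 conversion [illustrative sketch]:

    def minterm_number(bits):
        """Decimal designator of a bit string, e.g. '110110' -> 54."""
        return int(bits, 2)

    def bit_string(number, n):
        """The inverse conversion: 54 -> '110110' when n = 6."""
        return format(number, "0{}b".format(n))

    print(minterm_number("110110"))  # 54, so ABCcDEFc = M54
    print(bit_string(54, 6))         # '110110'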
First, two minterms [chromosomes/designs] having adjacent numbers on a minterm map
are not necessarily close in terms of their genetic states. Some are close and some are not; but
the relationships are orderly as I will explain. For example, here are two adjacent minterms M0
M1, where M0 = AcBcCcDcEcFc and M1 = AcBcCcDcEcF; they differ only in terms of the
complementation on one attribute [gene] F. But now consider adjacent minterms M31 and M32,
where M31 = AcBCDEF and M32 = ABcCcDcEcFc. As you see, they have completely different
genetic states; i.e. they differ in all six gene states. So adjacency of minterm numbers does not
tell us how similar the minterms are in any straightforward way. However, the relationship
between numbering adjacency and difference in genetic states is quite interesting and is shown in
Table 2 on the next page. What the table shows is how many genetic elements a minterm has in
common [C] with its immediate predecessor on the numbering scale ranging from 0 to 63. For
example, in Table 2 M16 has one element in common with M15 and five elements in common with
M17. One interesting fact is that every odd-numbered minterm has exactly five elements in
common with its immediate predecessor.
Here is a simple accounting of the number of minterms that have exactly k elements in
common with their immediate predecessors on a minterm map:
k     Number of minterms
5     32 [all the odd-numbered minterms]
4     16
3     8
2     4
1     2
0     1 [only M32]
There's nothing magical about the orderliness of these results; they are a simple consequence of
our ordering of the events in a minterm or chromosome so that we can obtain decimal equivalents
for the bit strings or binary designators that identify each minterm.
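This accounting, and the entries of Table 2 below, can be verified mechanically [an illustrative sketch]:

    from collections import Counter

    n = 6
    counts = Counter(
        sum(a == b for a, b in zip(format(i - 1, "06b"), format(i, "06b")))
        for i in range(1, 2**n)
    )
    print(sorted(counts.items(), reverse=True))
    # [(5, 32), (4, 16), (3, 8), (2, 4), (1, 2), (0, 1)]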
Minterm Number   C        Minterm Number   C
 0               ----      32              0
 1               5         33              5
 2               4         34              4
 3               5         35              5
 4               3         36              3
 5               5         37              5
 6               4         38              4
 7               5         39              5
 8               2         40              2
 9               5         41              5
10               4         42              4
11               5         43              5
12               3         44              3
13               5         45              5
14               4         46              4
15               5         47              5
16               1         48              1
17               5         49              5
18               4         50              4
19               5         51              5
20               3         52              3
21               5         53              5
22               4         54              4
23               5         55              5
24               2         56              2
25               5         57              5
26               4         58              4
27               5         59              5
28               3         60              3
29               5         61              5
30               4         62              4
31               5         63              5

Table 2
What's the essential message conveyed by Table 2? I believe it is that numbering
chromosomes/designs/minterms in terms of their binary designators or bit strings, however
reasonable this seems, makes it quite difficult to obtain easily interpretable and orderly fitness
landscapes in which we can talk about various fitness regions indicating many local and, perhaps,
one global fitness region. I hope the following figure helps to illustrate my concerns.
[Figure 3. A hypothetical fitness landscape: fitness g(Mi) plotted over the minterm map from M0 to M63, with the global maximum at M44, one of several local maxima indicated, and a low value at M40.]
The figure above shows a hypothetical fitness landscape associated with
chromosomes/designs/minterms that have been numbered in terms of their binary designators or
bit strings. The numbering of the minterms on the map above is the same as the numbering
shown in the figure on page 14. Associated with each minterm Mi is a numerical value g(Mi) that
indicates its fitness according to criteria that have been established. In this landscape I have
established one globally maximum value of g(Mi) and have indicated one among several values
of g(Mi) that are only local maxima. One trouble with small fitness landscapes such as this one is
that the search process would not take very long to find this global maximum, which as indicated
is at M44. Depending on how many parents we chose in our initial mating population, M(t0), we
could easily exhaust all 64 possible chromosomes/designs/minterms in a very few evolutionary
stages involving crossover and mutations. Successive operations of crossovers and mutations
producing new children designs would, in short order, keep coming back to the same
chromosomes/designs already discovered. This shows the limitations of my simulations based on
small numbers of binary variables, which I fully understand. But this simple minterm map does let
me illustrate what I take to be some important and interesting characteristics of evolutionary
"trajectories" that might be taken on much larger fitness landscapes associated with
chromosomes/designs/minterms based on binary variables.
The first point I will illustrate using the fitness landscape in Figure 3 concerns fitness
discontinuities that can occur as the evolutionary process unfolds. We might believe that the
evolutionary process always proceeds smoothly toward higher fitness regions and that an orderly
convergence to regions of greater fitness is usually observed as a result. To illustrate how this
steady convergence may not happen, suppose two parent chromosomes/designs have been
selected for mating and that, after crossover, one of the children chromosomes is M40 =
ABcCDcEcFc. As the above figure shows, the fitness of M40, g(M40), is quite small. But suppose
that, at the next evolutionary stage, the M40 chromosome had its gene Dc mutated to D. The result
would be chromosome M44 = ABcCDEcFc whose fitness is the global fitness maximum. Asked to
explain this drastic jump in fitness, we decide that A, C, and D interact strongly, in the absence of
B, E, and F, to increase fitness. All it took to turn M40 from a loser to a winner was a genetic
change from Dc to D. There are, of course, other routes to the chromosome M44, having the
global fitness maximum in Figure 3. One such route begins with mating parents M34 =
ABcCcDcEFc and M12 = AcBcCDEcFc. If a single crossover point is set between the second and the
third genes, one child that will result is M44 = ABcCDEcFc, whose fitness is globally maximum. The
fitness of M34 and M12 could be any measured value less than the global maximum; only M44 has
this maximum value in the present example.
I wish to recall Stuart Kauffman's point, mentioned above on page 23, about crossovers
being useful only on "highly correlated fitness landscapes in which regions of greatest fitness
cluster together". His comment raises some interesting matters concerning fitness of
chromosomes and the genetic similarity of adjacent chromosomes/minterms I have been
examining. One issue concerns how closely fitness of chromosomes clusters around
chromosomes/minterms that are adjacent on a minterm map. In Figure 3 above I have arbitrarily
assigned M44 a globally maximum value of a fitness function. Shown below in Figure 4 is the
genetic closeness of each of the chromosomes/minterms adjacent to M44. The number shown in
parentheses for each minterm indicates how many genetic elements the minterm has in common with M44. For
example, M35 has just two elements in common with M44 [A and Bc], while M45 has five elements in common
with M44.
M35 (2)   M36 (5)   M37 (4)
M43 (3)   M44       M45 (5)
M51 (1)   M52 (4)   M53 (3)

Figure 4
So, if fitness correlates with genetic similarity, then M36, M37, M45, and M52 should all have
fitness close to the fitness maximum at M44. One trouble is that genetic similarity with M44 does
not require the adjacency shown in Figure 4. For example M40 = ABcCDcEcFc, one of whose
genes I mutated to yield M44 = ABcCDEcFc, has five elements in common with M44 before
mutation. As you see, M40 is not adjacent to M44. I know that my fitness assignments in Figure 3
are perfectly arbitrary. All my present argument shows is that, if fitness and genetic similarity are
correlated, then we might not have the fitness clustering that Kauffman suggests. Another
example involves the two chromosomes, M34 = ABcCcDcEFc and M12 = AcBcCDEcFc, that I mated
to produce M44. M34 has just three genetic elements in common with M44, but M12 has five
elements in common with M44. Neither M34 nor M12 is adjacent to M44 on the fitness landscape.
Here is an example of a fitness criterion and how it could be expressed as a Boolean
function in the same way I expressed fitness criteria in terms of schemata above on page 22. The
criterion reads, in the six-binary-variable case: "Chromosomes [designs] are fittest when we have
A, B, and C, provided that we do not have both E and F". In symbols, f(A, B, C, D, E, F) =
ABC(EF)c = ABC(Ec ∨ Fc). From the minterm map on page 14 we see that M56, M57, M58, M60,
M61, and M62 meet this criterion. Asked to explain this criterion we might observe that the fittest
chromosomes/designs are the result of an interaction involving A, B, and C, that occurs when
either or both of E and F are absent. Further, this interaction between A, B, and C is not affected
by what state D is in. So, this fitness criterion, expressed as a Boolean function, acts to define a
fitness region in the fitness landscape shown across a minterm map.
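Criteria of this kind are easily checked by brute force on a small map. An illustrative sketch that recovers the six minterms just listed [bit order (A, B, C, D, E, F)]:

    def fit(bits):
        """The criterion f = ABC(EF)c; note that D plays no role."""
        A, B, C, D, E, F = (b == "1" for b in bits)
        return A and B and C and not (E and F)

    print([i for i in range(64) if fit(format(i, "06b"))])
    # [56, 57, 58, 60, 61, 62]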
There is an interesting relation between fitness functions and composite utility functions
encountered in decision theory and analysis. As in decision tasks, evolutionary processes
produce consequences that have value or utility. Value or utility here concerns the fitness of the
consequences of evolutionary processes in the form of new chromosomes or, in the case of
Inventor 2000/2001, new engineering designs. In both situations the consequences are
multiattribute in nature. Each attribute specifies a single dimension along which the value of a
consequence can be measured. For example, in deciding which one of several houses to
purchase we consider all the following attributes and, perhaps, many others as well: A1, Cost;
A2, Location; A3, Driving distance to work; A4, Floor plan; A5, House size; and so on. For any
particular house, we assign a value [V] to each of the observed levels of any of these attributes;
in short, we have V(A1), V(A2), V(A3) and so on for each of the attributes we have identified and
for which we have measured values. In the case in which houses have n measurable attributes,
the composite value V of any house [Hi] we are considering can then be represented by
V[Hi] = f[w1V(A1), w2V(A2), ..., wiV(Ai),..., wnV(An)], where wi is an importance weight
assigned to attribute Ai, V(Ai) is the value attached to some level of attribute Ai, and f is some
real-valued function that prescribes how the importance-weighted values V(Ai) attached to each
attribute are to be aggregated or combined to produce the composite value V[Hi]. Determining
sensible identifications of function f in different circumstances is anything but easy in decision
theory and analysis. A frequently-employed simplification is to assume that f is linear, in which
case we have:
i n
V [ Hi ]   wiV ( Ai ).
i 1
Such a linear function of course ignores any of the possible dependencies or interactions
that may exist among the n value attributes. Many books and papers have been written on the
various forms function f might take that account for various forms of interactions [e.g. Keeney &
Raiffa, 1976]. When we have n attributes we have 2^n - {n+1} possible interaction patterns involving
two or more attributes. The frequently-made linearity assumption ignores any subtleties or
complexities that reside in value attribute interactions. As we know, linear models never expose
any surprises that may so often lurk in these interactions. As I will now explain, we have these
same difficulties in our attempts to define useful/reasonable fitness functions g[Mi] that grade the
overall fitness of any chromosome/design/minterm Mi. Clearly, the appearance of a fitness
landscape assigned across some universe or search space, represented by a minterm map,
depends crucially on how fitness function g[Mi] is defined.
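For concreteness, the linearity assumption just displayed amounts to nothing more than the following [an illustrative sketch; the weights and attribute values are invented]:

    def linear_value(weights, values):
        """Composite value under the linear assumption: no interactions."""
        return sum(w * v for w, v in zip(weights, values))

    # A hypothetical house graded on five attributes, each scaled to [0, 1]:
    print(linear_value([0.4, 0.2, 0.1, 0.2, 0.1], [0.7, 0.9, 0.5, 0.6, 0.8]))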
There are some unusually difficult problems associated with grading the fitness of
chromosomes/designs when such a process is compared with value assessment in decision
analysis. Let's first see what is involved in assessing the value or utility of various levels of an
attribute of a choice consequence in decision analysis. A very common method is based on
grading the value/utility of an attribute on a [0, 1] scale, where 1 means highest value and 0
means lowest value. For a certain attribute a value function may be increasing if having more of
this attribute is better than having less of it. One example would of course involve money. Most
people would prefer having more money than less money. In other situations, however, having
more of something is worse than having less of it; this leads to a decreasing value function.
Driving distance to work is a good example in which we encounter decreasing value functions.
Both of these examples involve monotone value functions. But there are also situations in which
value is a concave function, with an intermediate maximum and a tailing off of value on either side.
Grading the overall fitness of a design, represented as a minterm, presents difficulties
that are not encountered in grading the composite value/utility of a decision consequence.
Consider a minterm or design Mi whose overall fitness g(Mi) is to be established. I must return to
the subscript identification of the event classes involved in a minterm that I introduced earlier on
page 5. What I need to illustrate fitness function identification is a "generic" minterm whose exact
genetic elements [alleles of genes] have not yet been identified. First, suppose A =
{An-1, An-2, ..., A1, A0} is the generating class for a minterm map resulting from n binary event
n 1
classes of the form Aj =
{Aj, Ajc}.
By definition, minterm MI =
Y
j 0
j
, where each Yj is either Aj or
Ajc. The generic minterm Mi, expressed in these terms is:
Mi
Yn-1
Yn-2
.
.
.
Yj
.
.
.
Y1
Y0
Yj = Either Aj or Ajc.
g(Mi) = Overall fitness of Mi.
Next, consider individual gene Yj whose possible alleles [genetic states] are Aj and Ajc.
What we need is a function fj(Yj) that prescribes the contribution of gene Yj to overall fitness g(Mi).
We might first suppose that fj(Aj) ≠ fj(Ajc); i.e., that the contribution to overall fitness of genetic
element or allele Aj is not the same as the contribution of Ajc. But this might not always be the
case because of possible interactions among genetic elements. For example, it might be true that
when Yk is in state Ak, then fj(Aj) = fj(Ajc). In other words, Ak renders overall fitness g(Mi) the same
whether Yj is in state Aj or Ajc.
The next consideration illustrates the basic distinction between grading the overall fitness
of a design/chromosome/minterm and the composite value/utility of a multiattribute decision
consequence. In decision theory and analysis involving multiattribute choice consequences we
assume that the value or utility associated with any attribute of this consequence is graded on the
same scale, commonly taken to be the [0, 1] interval. Employment of this grading scale is a
consequence of the celebrated preference axioms of von Neumann and Morgenstern [1946].
What they showed was that value/utility of consequences and their attributes could be graded on
a conventional probability scale. Of course it's true that the form of the value functions applied to
different consequence attributes may be quite different. Some may be increasing, some
decreasing, and some nonmonotonic. In grading the contribution of genetic element Yj to the
overall fitness of design or minterm Mi, we will observe that the fitness contributions given by
each fj(Yj) are different for each attribute Yj. In other words, fn-1, fn-2, ..., fj, ..., f1, f0 may all be quite
different. A basic trouble is that, unlike value/utility gradation, we will probably not be able to put
all of these fitness measures on a common scale of measurement. For example, suppose that in
grading the fitness of a wind-bracing design Yj = design weight and Yk = positioning of the
cross-members of the design. We expect fj(Yj) to be different from fk(Yk), and we expect their fitness
contributions to be measured on quite different scales. It might be worth investigating to see whether "fitness" can be
graded on a common scale across design attributes in the same way that value/utility can be
graded, across attributes, on a common scale. This would simplify matters a bit.
We come, finally, to perhaps the greatest difficulty in determining overall fitness function
g(Mi) for designs/chromosomes represented as minterms Mi. This difficulty concerns how we are
to aggregate or combine our individual genetic fitness contributions fj(Yj). In theory, what we need
is given by:
g(Mi) = F[fn-1(Yn-1), fn-2(Yn-2), ..., fj(Yj), ..., f1(Y1), f0(Y0)],
where F is a rule for combining the individual fitness contributions fj(Yj). All I can say at this point
is that F is presumably non-linear. We could assume linearity, as so often done in decision
analysis, but this would invite all the problems associated with ignoring important interactions or
nonindependencies among the fitness contributions of the individual genes in designs
represented as chromosomes or minterms. In most cases, just listing all the possible interactions
among these genetic elements would be an impossible task, let alone evaluating the fitness
consequences of these interactions. Again, if there are n binary genetic elements in a
chromosome/design, there are 2^n - {n + 1} possible interactions involving two or more of these
genetic elements. Determining F, as well as determining each of the fj(Yj), requires considerable
domain knowledge. As I noted earlier, our entire representation of a fitness landscape across
some search space depends upon how we define each fj(Yj) and F. There is so much more to be
said about the task of defining these crucial features of our work involving evolutionary
computation.
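As a final illustration of why F is presumably non-linear, here is a toy fitness function [entirely hypothetical numbers of my own] with one strong interaction among A, C, and D, echoing the M40-to-M44 jump discussed earlier:

    def g(bits):
        """Toy fitness: weak additive terms plus one three-way interaction."""
        A, B, C, D, E, F = (b == "1" for b in bits)
        fitness = 0.1 * sum([A, B, C, D, E, F])
        if A and C and D and not (B or E or F):   # the interaction term
            fitness += 1.0
        return fitness

    print(g("101000"), g("101100"))  # M40 vs M44: one gene flips, fitness jumps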
3.2 Boolean Functions and Evidence Marshaling in Discovery
My present belief is that Boolean functions and their decompositions can play an
important role during the process of discovery, especially when they are combined with the
genetically inspired evolutionary computational strategies I have briefly reviewed. I’ll begin with
the process of evaluating hypotheses.
3.2.1 Hypothesis Evaluation
My first task is simply to illustrate how hypotheses in discovery-related activities can
themselves be associated with Boolean functions. Following are some abstract assertions similar
to those made frequently and easily, in non-abstract circumstances, by scientists, engineers,
intelligence analysts, physicians, attorneys, historians, auditors and others who perform complex
discovery-related or fact investigation tasks. Suppose three persons fall to arguing about the
conditions under which hypothesis H might be true. Let’s leave aside for the moment what led to
the generation of hypothesis H in the first place.
1) First Person: "If events A, B, and C occur, or, if D and E occur but F does not, then I
would argue that hypothesis H would follow".
2) Second Person [In response to the first]: "I'll go along with your argument, provided
that event D does not occur. I don't believe H will explain D occurring in the presence of E but not
of F".
3) Third Person [In response to the first two]: "I think you are both wrong. For H to be
true, it does not matter whether events A, B, or C occur, all that matters is that D occurs when
neither E nor F occur".
In this simple situation we have three persons each making an assertion about
hypothesis H in the form of a Boolean function. For Person 1 the Boolean function is f1 =
ABC ∨ DEFc. For Person 2, the function is f2 = ABC ∨ DcEFc. For Person 3 the function is f3 =
DEcFc. I'll note here that Person 3's assertion is in fact a schema as defined in evolutionary
computation. Another way to write this assertion as a bit string is: (***100). Using the minterm
map shown above on page 14 we can list all possible binary event combinations that are
consistent with each of these three assertions. For Person 1, minterms M56 through M63 and
minterms M6, M14, M22, M30, M38, M46, and M54 are all specific event combinations that are
consistent with H. For Person 2, hypothesis H would explain minterms M56 through M63 and
minterms M2, M10, M18, M26, M34, M42, and M50. For Person 3, hypothesis H would only explain
minterms M4, M12, M20, M28, M36, M44, M52, and M60. Thus, Person 3's assertion is the most
restrictive of the three Boolean statements about hypothesis H. For Persons 1 and 2 there are
fifteen different event combinations that could be explained by hypothesis H, though seven of the
event combinations are different for these two persons. But only eight event combinations are
explained by H according to Person 3.
A minterm decomposition of each of these three Boolean assertions helps us settle
arguments about whose specification of H, if any among the three that are offered, agrees with
evidence obtained about these six binary classes of events; here are some examples. First,
suppose we know now that H is true but have observed M43 = ABcCDcEF. This makes all three
persons wrong in their assertions about H, since M43 does not appear in the decomposition of any
of their Boolean assertions [i.e. we need an entirely new definition of hypothesis H]. Suppose
instead that we have observed M58, when we know that H is true. This agrees with the assertions
of both Person 1 and Person 2 since M58 appears in decompositions of both of their Boolean
assertions. However, we cannot tell who is generally correct, Person 1 or Person 2; all we have
done is to rule out Person 3's assertion regarding hypothesis H. Evidence in the form of M60 =
ABCDEcFc would not rule out any of the three assertions made about H, since it appears in the
decompositions of all three of them. Finally, some evidence combinations would favor just one
hypothesis assertion among the three we have considered. For example, M38 would only favor
Person 1’s assertion, M26 would only favor Person 2’s assertion, and M28 would only favor Person
3’s assertion.
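The three decompositions used above can be generated mechanically. An illustrative sketch [bit order (A, B, C, D, E, F)]:

    def decompose(f):
        """Minterm numbers of all event combinations satisfying assertion f."""
        return [i for i in range(64)
                if f(*(b == "1" for b in format(i, "06b")))]

    f1 = lambda A, B, C, D, E, F: (A and B and C) or (D and E and not F)
    f2 = lambda A, B, C, D, E, F: (A and B and C) or (not D and E and not F)
    f3 = lambda A, B, C, D, E, F: D and not E and not F

    print(decompose(f3))  # [4, 12, 20, 28, 36, 44, 52, 60]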
3.2.2 Hypothesis Generation
Boolean functions and their decompositions can play other roles besides helping to
evaluate the suitability of hypothesis statements. They also appear in the generation or discovery
of new hypotheses, which I will now explain. This role brings Boolean functions in contact with the
evolutionary computational approach we are using, as well as with the evidence marshaling tasks
studied by Carl Hunt. In the examples just provided we were considering some hypothesis H and
what it might explain, but we gave no consideration to the manner in which this hypothesis was
first generated; perhaps it was just a guess on someone’s part. But I now wish to consider the
generation of hypotheses from evidence we are gathering and how Boolean functions arise
during this process. As I proceed, I will present some thoughts about how evolutionary
computational strategies can be employed during these important discovery-related tasks. One
place to begin is with Sherlock Holmes and his views about the process of discovery. Along the
way I will combine his thoughts with those of John Henry Wigmore, the celebrated American
evidence scholar, whose writings on evidence in law I have so often plundered during the past 35
years.
In The Boscombe Valley Mystery, Holmes tells his colleague Dr. Watson: "You know my
method. It is founded on the observance of trifles". The trifles to which Holmes was referring
consist of any form of observation, datum, or detail that may later become evidence in some
inference task when its relevance is established. Evidence is relevant if it bears in some way on
hypotheses already being entertained or if it allows the generation of a new hypothesis. Trifles
may be tangible in nature in the form of objects, passages in documents, details in sensor
images, etc; or they may be obtained from the parsing of the testimony from human sources. In
every case, however, evidence reveals the possible occurrence of certain events whose actual
occurrence would be important in inductive or probabilistic reasoning tasks. It is here that
Wigmore's ideas become very important. In his Science of Judicial Proof [Wigmore, 1937]
Wigmore advises us to distinguish between evidence of an event and the event's actually
occurring. Evidence of an event and the event itself are not the same. Just because we have
evidence that event E occurred does not entail that it did occur. The source of this evidence,
including our own observations, might not be perfectly credible. So, we must distinguish between
evidence E* and event E itself; from evidence E* we can only infer the occurrence of event E.
As we make observations and ask questions [perhaps the most important element of
discovery], trifles begin to emerge, often at an astonishing rate. From an emerging base of trifles
we begin to generate ideas in the form of hypotheses concerning possible explanations for the
trifles we are observing. Here we encounter search problems having the same complexity as the
ones encountered in Inventor 2000/2001. Figure 5 illustrates a major problem we face in
generating new hypotheses in productive and efficient ways.
[Figure 5. An emerging trifle base: hypothesis Hi is generated from a single trifle, while hypothesis Hj is suggested by marshaling a combination of several trifles.]
On occasion we might get lucky and be able to generate a plausible hypothesis from just
a single trifle. In Figure 5, just a single trifle allowed us to generate hypothesis Hi as a possibility.
As an example, the finding of a fingerprint, a DNA trace, or a footprint might allow us to identify a
particular suspect in a criminal investigation. More commonly, however, new hypotheses arise as
we consider combinations of trifles, details, or data. In Figure 5 a new hypothesis Hj is suggested
by bringing together, marshaling, or colligating [Charles S. Peirce's term] several trifles that seem
to be related in some way. Here is where the trouble starts. Looking through all possible
combinations of trifles, in the hope of finding interesting ones, is as foolish as it is impossible. The
number of possible combinations of two or more trifles is exponential with the number of trifles we
have. With n trifles, we have 2^n - {n+1} possible trifle combinations. The question is: How do we
decide which trifle combinations to examine that might be suggestive of new hypotheses that
can be taken seriously?
I add here, parenthetically, that the events of September 11, 2001 have occurred since I
wrote the first version of this paper. Since these events, we have all heard the phrase:
"connecting the dots" and how our intelligence services have not been so good at this task. What
Sherlock Holmes referred to as "trifles" are now commonly referred to as "dots". There is nothing
simple about the task of "connecting the dots" in any situation whether in intelligence analysis,
medicine, history, law, or whatever.
In our work extending over nearly ten years, Peter Tillers and I studied a variety of
evidence marshaling strategies designed to enhance the process of generating new hypotheses
from combinations of trifles [or "dots"]. Different evidence marshaling strategies are necessary at
different points during an episode of discovery. We identified fifteen different evidence marshaling
strategies and showed how they could be implemented in a prototype computer system we called
Marshalplan. A review of this work appears elsewhere [Schum, 1999]. Each marshaling operation
we identified plays the role of a metaphoric magnet or attractor for bringing together trifles that,
taken together, may suggest new hypotheses or new lines of investigation. Carl Hunt extended
this work considerably by showing, in his doctoral dissertation [2001], how trifles [potential items
of evidence] could be made to self-organize and to suggest scenarios leading to new hypotheses
and new lines of investigation.
A few words are necessary about hypotheses and their mutation or revision as discovery
proceeds. Hypotheses in many areas are generated from marshaled collections of thoughts and
evidence regarding events that happen over time. Taken together and ordered temporally, these
collections of thoughts and evidence begin to resemble scenarios, stories, or possible complex
explanations. We can in fact represent these scenarios as minterms, or chromosomes, provided
that we restrict our attention to binary events. But we need to expand the kinds of binary genetic
elements involved in a scenario considered as a minterm or chromosome. There are three
classes of binary elements [genetic states] we need to consider in the construction of a scenario,
story, or narrative account of some emerging phenomenon for which we are seeking an
explanation or hypothesis.
In some cases we will have specific evidence A*, that event A occurred. This is called
positive evidence; so-named because it records the occurrence of an event. But we might instead
have received evidence Ac*, that event A did not occur. Evidence of the nonoccurrence of an
event is called negative evidence. So, one class of events we must consider is the binary
evidential class {A*, Ac*}. In some rare instances, we might be willing to say that we know for sure
that event A occurred. Knowing for sure that event A occurred means that the source of evidence
about this event is unimpeachable. We might instead “know” that event A did not occur [A c]. So,
another possible binary event class is {A, Ac}. Finally, in order to fill in gaps left by evidence we do
not have, or by lack of any knowledge about events, we often insert hypothetical events that are
also called gap-fillers. Usually, these gap-fillers are based on guesses, hunches, or upon past
experience. Here is the major heuristic value of constructing scenarios during discovery and
investigation. Each gap-filler we identify in order to construct a scenario or story that “hangs
together” opens up a new line of investigation. Let a = a gap-filler or hypothetical saying that
event A might have occurred. Then, let ac = a gap-filler saying that event A might not have
occurred. Thus, the binary class {a, ac} represents gap-fillers or hypotheticals indicating the
guessed or inferred occurrence or nonoccurrence of event A. All stories or scenarios are mixtures
of fact and fancy. The fanciful elements of our scenarios consist of these gap-fillers or
hypotheticals. I add here that, when I speak of marshaling thoughts and evidence, at least some
of these thoughts may be the gap-fillers we introduce in scenarios. They do in fact represent
potential items of evidence we might collect.
Here, in symbolic form, is what a scenario might look like when cast in terms of the three
binary classes just identified: {A*, Ac*}, {A, Ac}, and {a, ac}. First, suppose our scenario concerns
events A, B, C, D, E, and F. We have evidence A* and C* that events A and C occurred, and we
have evidence Ec* that event E did not occur. Having no evidence [yet] about whether event B
occurred, we insert gap-filler b to link together evidence items A* and C*. We insert another
gap-filler dc as a guess that event D did not occur. Finally, suppose we are willing to believe with
perfect confidence that event F occurred. In a homicide investigation, for example, F may
represent the event that victim V was killed. We know V was killed because we are presently
looking at V's corpse on a slab [V was identified by his wife]. So, in minterm or chromosome form,
our scenario can be represented by: (A*bC*dcEc*F). This means that we have the following six
generating classes of events: {A*, Ac*}, {b, bc}, {C*, Cc*}, {d, dc}, {E*, Ec*}, and {F, Fc}. We can
still employ the binary designator or bit string method to keep track of the 64 possible minterms
representing possible variations in the scenario being constructed. In the present example we
have the bit string (111001). So, our current minterm (A*bC*dcEc*F) = M57, using the six variable
binary designator system shown in Figure 1 on page 14.
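The bookkeeping for scenarios works exactly as it did for designs. An illustrative sketch that converts the scenario just described into its bit string and minterm number [the ordering and labels are mine]:

    # 1 marks the uncomplemented member of each binary class, 0 the complement.
    scenario = {"A*": 1, "b": 1, "C*": 1, "d": 0, "E*": 0, "F": 1}

    bits = "".join(str(v) for v in scenario.values())
    print(bits, int(bits, 2))  # 111001 57, so (A*bC*dcEc*F) = M57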
Variations in our emerging scenario or story may take place for any number of reasons.
Some of these variations will occur that involve the six classes of binary events suggested by our
scenario M57. As time passes and we gather new evidence and have new ideas, we will of course
need to add new classes of events representing new evidence, new gap-fillers, and possibly new
known events. In short our minterm map will naturally grow larger. In complex situations the
number of possible scenarios or stories we can tell will begin to approach the size of the search
space in Inventor 2000.
Using just the six generating classes shown above, here are some ways in which our
scenarios or stories might change. Any changes may well suggest new hypotheses, some of
which may be quite interesting and perhaps even more valuable than the story we are currently
telling on the basis of (A*bC*dcEc*F) = M57. First, we might be interested in seeing whether a
story would make sense, and suggest a new hypothesis, if we changed gap-filler b to bc and/or
changed gap-filler dc to d. This brings to mind the mutation operations in evolutionary
computation. Other scenario revisions may have a basis in the credibility of the sources of our
evidence. For example, we had a source who/that reported the occurrence of event A [this report
we labeled A*]. Suppose we now have reason to believe that this source’s credibility is suspect
and possibly have another source that reports Ac*, that event A did not occur. Or we might
instead wish to examine how our story might change if the original source had reported Ac* rather
than A*. Finally, we must be prepared to change our minds about whether we really “know” that a
certain event occurred. For example, as described above, we let “known” event F = Victim V was
killed. Victim V was positively identified by a woman who identified herself as V’s wife. What we
know for sure is that we have a dead person on our hands. However, we might discover that this
woman is not V’s wife after all. Can we still be sure that this dead person is V and not someone
else? Thus, we might consider how our story would change if we changed event F to event F c.
Our scenarios or stories might be revised for other reasons that will involve changes in
the ingredients of their minterm representations. For example, (A*bC*dcEc*F) = M57 is based on
two gap-fillers b and dc. These gap-fillers, as mentioned, open up new lines of investigation and
so we begin the search for evidence about these events that now have just hypothetical status.
Suppose we find credible evidence that events B and D did in fact occur [i.e. we have evidence
B* and D*]. In our guesses, we were apparently correct to guess b, but incorrect to guess dc. So
now our story, in minterm form, looks like: (A*B*C*D*Ec*F). We still have 64 possible alternative
scenarios except that we now have five evidence classes and one class representing “known”
events. Our binary designator, and hence our minterm number, will also change in this example
because we altered the complementation pattern [i.e. we went from dc to D*]. So, our new scenario is (A*B*C*D*Ec*F) = M61
[its new bit string is (111101)]. The point is that we can always keep track of possible scenarios in
an orderly way, provided that we make appropriate adjustments in the manner in which we
identify the generating classes for a minterm map.
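To make this bookkeeping concrete, here is a minimal Python sketch [my own illustration; the function name is hypothetical, and the code is not part of Inventor 2000 or ABEM] that turns a scenario's complementation pattern into its minterm number:

# Each generating class contributes one bit: 1 for the uncomplemented member
# (A*, b, C*, ...) and 0 for the complemented one (Ac*, bc, Cc*, ...).
# The minterm number is simply the bit string read as a binary integer.

def minterm_number(bits):
    """Convert a complementation pattern, most significant bit first,
    into its minterm index."""
    index = 0
    for bit in bits:
        index = (index << 1) | bit
    return index

# Scenario (A* b C* dc Ec* F): bit string (1,1,1,0,0,1)
print(minterm_number([1, 1, 1, 0, 0, 1]))  # -> 57, i.e. M57

# Revised scenario (A* B* C* D* Ec* F): bit string (1,1,1,1,0,1)
print(minterm_number([1, 1, 1, 1, 0, 1]))  # -> 61, i.e. M61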
I hope one point emerging from the above discussion of minterm representations of
scenarios or stories is that we still have a search problem on our hands. It is just possible that the
methods of evolutionary computation can assist us in generating “fitter” explanations or
hypotheses. Just because we construct a scenario or tell a story does not mean that all of its
ingredients are as we suppose them to be. In Carl Hunt’s doctoral dissertation work, his ABEM
system generated "histories" that could be "reverse parsed" and converted into scenarios [Hunt,
2001]. These scenarios, in turn, allow the user to generate hypotheses to explain the events
being observed. Carl’s work made no use of evolutionary computation but I am going to suggest
that such methods might be very useful in generating alternative scenarios or stories that may
suggest alternative hypotheses, some of which may be more interesting and productive than
ones we initially entertain. I have already described how we might “mutate” the ingredients of a
scenario, expressed as a minterm or chromosome. My next task is to illustrate how we might
recombine, via crossovers, the ingredients of two “parent” scenarios to produce, as children,
entirely new scenarios or stories. These new scenarios may in turn suggest possibilities or
hypotheses that are “fitter” than ones we may earlier have entertained.
Discovery or investigation rests on the process of inquiry, the asking of questions. Such
questions themselves can become “magnets” or “attractors” for extracting from an emerging trifle
base interesting combinations of trifles that may suggest new and more valuable [fitter]
hypotheses. We understand that, at present, there are no computers that can, by themselves,
generate hypotheses from collections of thoughts and evidence. But they can certainly be made
to assist the persons whose intelligence, experience, and awareness do allow them to
generate hypotheses or possible explanations. Hypotheses themselves can serve as “magnets”
or attractors in bringing together or colligating combinations of trifles. Some of these trifles may
become evidence favoring the hypothesis that attracted them, others may become evidence that
disfavors this hypothesis. Following is one example of how the recombination process in
evolutionary computation might be very useful in generating and exploring new combinations of
trifles.
The example I have chosen involves two scenarios represented as minterms or
chromosomes that may each have been generated by a question. In some cases, of course, the
questions we ask may be related in some way and so we might expect them to attract at least
some of the same trifles. Suppose the first question [Q1] attracts the evidential trifles A*, C*, Ec*,
and F*. The person asking this question believes this combination of trifles suggests a scenario
[S1] and inserts gap-fillers b and d to make the emerging scenario tell a more connected story.
Arranged in temporal order, the elements of this scenario, expressed in minterm form, are as
follows: (A*bC*dEc*F*). A second question [Q2] is asked, possibly though not necessarily by the
same person who asked Q1. This second question attracts evidential trifles A*, C*, K* and L* that
together suggest a different scenario [S2]. To make this scenario more coherent the person
inserts gap-fillers g and j. Arranged temporally, a minterm representation of this scenario is:
(A*gC*jK*L*). As you can observe, trifles A* and C* appear in both of these scenarios. Notice,
however, that the events recorded in A* and C* are linked in different ways in these two
scenarios. In S1 they are linked by gap-filler b and in S2 they are linked by gap-filler g. Also notice
that each of these scenarios suggests a different six-variable minterm map, each one showing
possible variations of each of these two scenarios. The difference is that S1 concerns events A, B,
C, D, E and F, but S2 concerns events A, G, C, J, K, and L.
The two minterm maps just described show two ways of varying our scenarios or stories;
but there is a third way that involves crossovers among the two scenarios. As shown below,
suppose we mate our two scenarios and set a single crossover point at the third position. The
result is:
     S1:   A*   b   C*  |  d   Ec*  F*
     S2:   A*   g   C*  |  j   K*   L*
     -----------------------------------
     S3:   A*   b   C*  |  j   K*   L*
     S4:   A*   g   C*  |  d   Ec*  F*

[The vertical bar marks the single crossover point after the third position; S3 and S4 exchange the parents' tails.]
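In code, this single-point crossover [together with the sort of point mutation discussed earlier] might look as follows. This is a sketch of my own for illustration; ABEM itself, as noted, made no use of evolutionary computation:

def crossover(parent1, parent2, point):
    """Single-point crossover: exchange everything after `point`."""
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

def mutate(scenario, position, replacement):
    """Point mutation: replace one ingredient, e.g. gap-filler b -> bc."""
    mutated = list(scenario)
    mutated[position] = replacement
    return mutated

S1 = ["A*", "b", "C*", "d", "Ec*", "F*"]
S2 = ["A*", "g", "C*", "j", "K*", "L*"]

S3, S4 = crossover(S1, S2, 3)
print(S3)  # ['A*', 'b', 'C*', 'j', 'K*', 'L*']
print(S4)  # ['A*', 'g', 'C*', 'd', 'Ec*', 'F*']

print(mutate(S1, 1, "bc"))  # ['A*', 'bc', 'C*', 'd', 'Ec*', 'F*']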
By mating S1 and S2 we have produced two new scenarios, as children, S3 and S4. It may
happen that either S3 or S4 suggests entirely new hypotheses no one would have thought of from
S1, S2, or any of their possible variations. So, what the crossover operation has done here is to
suggest entirely different stories, all of which will be based on existing thoughts and evidence. A
final thought here is that both S3 and S4 generate new and different minterm maps, each of which
will provide possible variations in these two new scenarios or stories. Perhaps one of these
revisions of either S3 or S4 will be even “fitter” than S3 or S4. It is probably well past time for me to
take on the task of trying to state what fitness means when this term is applied to new hypotheses
being generated.
As mentioned earlier, in the generation of new engineering designs it is possible, though
usually difficult, to develop real-valued multivariate fitness functions that grade the overall fitness
of designs being generated by evolutionary computations. The degree of fitness of new
hypotheses certainly raises some interesting and important issues, only some of which have, to my
present knowledge, been addressed. The first issue concerns the process of discovery itself and
the nature of the hypotheses we generate or discover. Basically, all of my discussion of Boolean
functions and minterms in the generation of hypotheses has involved possible adjuncts to the
abductive reasoning process by which new hypotheses arise. In particular, the possible
applications of evolutionary computation to scenario and hypothesis generation that I just
described can be thought of as ways to assist people in performing imaginative or abductive
reasoning. This form of reasoning is, according to Charles S. Peirce, associated with the
generation of new ideas in the form of hypotheses or possible explanations of phenomena of
interest to us. As we know, deductive reasoning shows that something is necessary and inductive
reasoning shows that something is probable. On most accounts new ideas are not generated by
either deductive or inductive reasoning. But abductive reasoning only shows that something is
possible. Grading the fitness of new hypotheses in terms of probability, at least ordinary
Kolmogorov probabilities, does not seem sensible since, during discovery, we may not have any
disjoint and exhaustive hypotheses. In addition, as I will mention a bit later, our hypotheses may
easily mutate or change, or be entirely eliminated, as discovery lurches forward.
Because abductive reasoning just generates hypotheses that are possible, perhaps we
might consider grading them in terms of their possibility. I know of one person who has carefully
distinguished between possibility and probability, namely the British economist G. L. S. Shackle.
In his work Decision, Order, and Time in Human Affairs [1968], he offers a theory of potential
surprise according to which we might grade the possibility of hypotheses. Clearly, possibility and
probability are not the same. Some hypothesis, certainly very possible, might have very low
probability in light of present evidence. The distinct possibility that you have a certain disease
worries you. But after an extensive series of diagnostic tests your physician tells you to stop
worrying since not one of these tests shows any likelihood of your having this
disease. Shackle’s theory of potential surprise is quite interesting but I have not yet examined
whether its requirements could be met during discovery in which our hypotheses may suffer
continual mutation, change, or elimination. I now address these matters.
In many situations hypotheses we generate are initially vague, imprecise, or
undifferentiated. For example, in a criminal investigation we may first entertain the hypothesis H0,
that the victim's death was the result of a criminal act. Our first evidence is that the killer was male
[A*]. So, H0 now reads: "The victim was killed by a criminal act committed by a male" [only slightly
more specific]. New evidence suggests that the killer was also under the age of 30 [B*]. H0 now
is: (A*B*), that the killer was male and under the age of 30. We next guess that the killer was
left-handed [c]; so now H0 = (A*B*c). Further evidence suggests that the killer was known to the victim
[D*]. Now H0 becomes: (A*B*cD*) and begins to resemble a scenario or story. H0 would now
read: “The victim’s death was the result of a criminal act performed by a male under the age of 30
who was known to the victim and who was possibly left-handed”. In Glenn Shafer’s terms [e.g.
1976, 115-121], what we have done here is to refine hypothesis H0 by incorporating new
evidence in it to make it more specific and less vague. Later very credible evidence that
the killer was female might cause us to eliminate H0 altogether. We might be well-advised,
however, to keep track of all the reasons why we chose to eliminate H0 [at least tentatively]; this
gives us some protection against hindsight critics who will chastise us if it turns out that H0
contained truth after all. Shafer’s system of belief functions does allow us to assign numbers to
hypotheses that mutate or change. It may be useful to examine this system carefully in
connection with discovery-related tasks.
Speaking of truth, it might be argued that the obvious way to grade the fitness of
generated hypotheses is the degree to which they seem to contain “truth”. The global maximum
on the fitness landscape across all hypotheses that could be considered would be the hypothesis
that is truthful in all respects; i.e. no other hypothesis could possibly offer a better explanation of
the phenomena of interest. There are many troubles with this prescription, perhaps the most
obvious being that we may never be able to tell the extent to which any hypothesis contains “the
whole truth and nothing but the truth”. Twelve jurors reached the verdict, beyond reasonable
doubt, that Nicola Sacco and Bartolomeo Vanzetti were guilty of first-degree felony murder in the
slaying of a payroll guard named Alessandro Berardelli on April 15, 1920. Did the jurors reach
“truth” in their verdict? This question still arouses great controversy today as Jay Kadane and I
discovered in our probabilistic analysis of the Sacco and Vanzetti evidence [1996]. The point is
that in so many situations there will never be any “gold standard” or ground truth against which to
evaluate hypotheses generated during discovery.
So, what are we left with as possible ways to assess the “fitness” of new hypotheses we
generate during discovery? Two possible suggestions known to me at present come from the
works of the logicians Jaakko and Merrill Hintikka [e.g. 1983, 1992, 1991] and the philosopher
Isaac Levi [e.g. 1983, 1984, 1991]. The Hintikkas propose an interrogative model for discovery
in which a game is being played against nature. At any step in this game the players have a
choice about whether to deduce what is or has happened from acquired knowledge or to ask a
new question of nature. These new questions are graded by the extent to which they are
strategically important in the sense that they open up new productive lines of investigation. This
eliminates problems associated with trying to define and identify what are the “right” questions. At
any stage of discovery, only clairvoyance would allow us to determine exactly what questions we
should ask at this stage. So, one way to assess the fitness of hypotheses seems to be in terms of
their judged strategic importance: Will this new possibility we are considering allow us to further
the investigation in productive ways? In many episodes of discovery this might not be such an
easy question to answer. The complex and especially nonlinear nature of the world about us is
always full of surprises.
In Levy’s works he attends, among other things, to distinctions between the various forms
of reasoning I mentioned above. Specifically, he notes current arguments that discovery involves
induction rather than abduction. In his own arguments Levy notes that induction, involving the
justification or testing of hypotheses, we already have existing hypotheses that have been
generated somehow and that are taken seriously. In abductive reasoning, however, our only
claim is that some hypothesis is possible; i.e., we have not yet accepted this hypothesis into our
corpus or body of knowledge. Levi argues that one way of grading the suitability or fitness of a
new hypothesis is to test its informational virtue; i.e. to see what new phenomena [potential items
of evidence] this new hypothesis suggests. This new hypothesis will be useful to the extent that it
allows us to open new lines of evidence and to generate other new possibilities.
In closing here I simply mention that discovery-related processes are not sufficiently well
understood so that we can easily describe how they may be assisted in various ways. Attempts to
contrive a logic of discovery have not met with any discernible success. It has been said that the
human brain is the very cathedral of complexity in the known universe [Coveney & Highfield, 1995,
279]. As far as I can tell the most interesting and complex "services" taking place in this cathedral
concern how we generate new ideas in the form of hypotheses or possible explanations for
events of interest to us.
3.3 Boolean Functions, Ksat, and Phase Transitions in Discovery
I hope I have given adequate evidence that, for binary events, hypotheses can be
expressed as Boolean functions. I have given two examples. The first involved expressing
hypotheses about design fitness criteria as Boolean functions; I gave an example above on page
27. My second example [pages 31-32] involved any investigation in which alternative conjunctive
event combinations [scenarios] may be associated with some hypothesis. Further, as my
comments on the minterm and maxterm representations of any Boolean function illustrate, we
can decompose such functions to determine the exact number of specific ways in which any
Boolean function can be satisfied. Thus, in the case of engineering designs, we can in theory at
least determine how many specific designs, expressed as minterms, satisfy given fitness criteria.
For prodigiously large design search spaces, such as those encountered in Inventor 2000, we will
not be able to list all the possible ways in which some fitness criteria can be satisfied. In the case
of other investigations we can use these decomposition methods to help settle arguments about
which combinations of evidence [i.e. scenarios] some hypothesis best explains. Since all
discovery involves the generation of hypotheses, I have always been interested in ways for
determining alternative specific ways in which some hypothesis might be satisfied.
How pleased I was recently to discover that others have been deeply concerned about
Boolean functions and the satisfiability of hypotheses in various contexts. While reading Stuart
Kauffman’s new book Investigations [2000] I was greatly interested in his discussion of Boolean
functions and what he has termed the Ksat problem [Kauffman, 2000, 192 - 194]. The term Ksat
is shorthand for K-satisfiability. Associated with this Ksat problem is a very interesting phase
transition that, I believe, has very important implications for hypothesis generation and testing in
any context. The following comments, if nothing else, supply a different formulation for examining
Ksat problems. What I have done is to take an example Boolean function that Kauffman presents
and express this same function in two formally equivalent ways using the minterm and maxterm
expansions I have discussed. In the process I hope to add a bit to the discussion of the Ksat
problem and its consequences.
Kauffman begins by saying [2000, page 192]: “Any Boolean function can be expressed in
‘normal disjunctive form’, an example is (A1 or A2) and (A3 or A4) and (not A1 or not A4)”. Sloth
overtakes me just now and so I will eliminate the subscript terminology here and let A1 = A, A2 =
B, A3 = C and A4 = D; I will also suppress the intersection symbol according to the convention I
mentioned above on page 3. With these revisions, Kauffman’s Boolean function can be written
as: f(A, B, C, D) = (A ∪ B)(C ∪ D)(Ac ∪ Dc). Before I express this function in two different ways, I
need to say a bit about terminology. It seems that there is some disagreement among
mathematicians about what to call Boolean statements involving disjunctions of conjunctions or
conjunctions of disjunctions. For example, my favorite mathematics dictionary [Borowski &
Borwein, 1991] defines a disjunctive normal form as a disjunction of conjunctions and a
conjunctive normal form as a conjunction of disjunctions. All Boolean functions have terms in
parentheses whose events are themselves connected by either disjunction or conjunction. The
parenthesized terms [Kauffman calls them clauses] are in turn connected disjunctively or
conjunctively. The definitions I have just given focus on how clauses are connected and not on
how the events in a clause are connected. However, I agree with Kauffman’s interpretation since
it corresponds with how I have described my minterm and maxterm expansions of Boolean
functions. We both focus on how the events within a parenthesis are connected. Minterms involve
events connected conjunctively and maxterms involve events connected disjunctively. I will now
express Kauffman’s Boolean function in two different ways, each of which provides additional
information about this function and sets the stage for my discussion of the Ksat problem.
3.3.1 Kauffman's Example in Conjunctive Canonical Form [Minterms]
Using the minterm expansion theorem, together with the method I described above on
page 8, we can express Kauffman's f(A, B, C, D) = (A ∪ B)(C ∪ D)(Ac ∪ Dc) as (AcBCcD) ⊎
(AcBCDc) ⊎ (AcBCD) ⊎ (ABcCDc) ⊎ (ABCDc). Using the binary designator or bit string method for
numbering minterms, we can also express f(A, B, C, D) here as M5 ⊎ M6 ⊎ M7 ⊎ M10 ⊎ M14 [see
Figure 6 below]. Remember that the symbol ⊎ means "disjoint union". So, Kauffman's Boolean
function f(A, B, C, D) = (A ∪ B)(C ∪ D)(Ac ∪ Dc) can be satisfied in any of five specific ways as
indicated by these five minterms. Remember that minterms represent the finest grain
decomposition of a basic space of outcomes that is allowed by the binary nature of the events in
Boolean functions. Thus, if Kauffman’s Boolean function was associated with some hypothesis H,
the five minterms just identified show the exact number of ways that H could be satisfied. The five
minterms listed here might each, with appropriate event labeling, be a possible scenario that
satisfies H. In the engineering design context, H might be some statement of fitness and the five
minterms represent the five specific designs that will satisfy H.
Now, in Kauffman’s example, we have V = four binary variables. As mentioned earlier,
we thus have 2^4 = 16 possible minterms shown in the diagram below. Thus, we also have 2^16 =
65,536 possible Boolean functions in this four-variable case. This simply tells us the total number
of hypotheses that are possible concerning the four binary variables in this case.
              Ac        Ac        A         A
              Bc        B         Bc        B
 Cc   Dc      M0        M4        M8        M12
 Cc   D       M1        M5        M9        M13
 C    Dc      M2        M6        M10       M14
 C    D       M3        M7        M11       M15

                      Figure 6
In pictorial form, Figure 6 shows the conjunctive satisfiability of Kauffman’s Boolean function. Five
of the sixteen possible minterms will each satisfy this Boolean function.
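This expansion is easy to verify by brute force. The short Python sketch below [my own illustration, not drawn from Kauffman] enumerates all sixteen assignments of (A, B, C, D) and confirms that exactly the minterms M5, M6, M7, M10, and M14 satisfy the function:

from itertools import product

# Kauffman's function f = (A or B) and (C or D) and (not A or not D)
def f(a, b, c, d):
    return (a or b) and (c or d) and ((not a) or (not d))

satisfying = []
for bits in product([0, 1], repeat=4):
    a, b, c, d = bits
    if f(a, b, c, d):
        # The minterm number is the bit string ABCD read in binary.
        satisfying.append(8 * a + 4 * b + 2 * c + d)

print(satisfying)  # -> [5, 6, 7, 10, 14], i.e. M5, M6, M7, M10, M14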
3.3.2 Kauffman’s Example in Disjunctive Canonical Form [Maxterms]
As I discussed earlier [pages 9-12], there is a formally equivalent way of expressing
Boolean functions, such as the one Kauffman describes, in disjunctive canonical form in terms of
maxterms. I will use both the De Morgan and the Gregg methods to produce a canonical
decomposition of Kauffman's Boolean function f(A, B, C, D) = (A ∪ B)(C ∪ D)(Ac ∪ Dc). I use
both methods for two reasons. First, they are both informative, but in different ways. Second, they
provide a check on my Boolean manipulations. I should of course get the same answer using
both methods.
I first begin with the minterm decomposition of f(A, B, C, D) = (A ∪ B)(C ∪ D)(Ac ∪ Dc),
which I claimed was f(A, B, C, D) = M5 ⊎ M6 ⊎ M7 ⊎ M10 ⊎ M14. We first note that [M5 ⊎ M6 ⊎ M7
⊎ M10 ⊎ M14] = [M0 ⊎ M1 ⊎ M2 ⊎ M3 ⊎ M4 ⊎ M8 ⊎ M9 ⊎ M11 ⊎ M12 ⊎ M13 ⊎ M15]c. If we apply de
Morgan's law twice to this complemented term in the right-hand expression, we can express in
disjunctive canonical form Kauffman's Boolean function

f(A, B, C, D) = (A ∪ B)(C ∪ D)(Ac ∪ Dc) =
(A ∪ B ∪ C ∪ D)(A ∪ B ∪ C ∪ Dc)(A ∪ B ∪ Cc ∪ D)(A ∪ B ∪ Cc ∪ Dc)(A ∪ Bc ∪ C ∪ D)
(Ac ∪ B ∪ C ∪ D)(Ac ∪ B ∪ C ∪ Dc)(Ac ∪ B ∪ Cc ∪ Dc)(Ac ∪ Bc ∪ C ∪ D)
(Ac ∪ Bc ∪ C ∪ Dc)(Ac ∪ Bc ∪ Cc ∪ Dc).
The first thing to note here is that Kauffman’s Boolean function, expressed in disjunctive
canonical form, involves eleven disjunctive maxterms, all of which are combined conjunctively. In
other words, we must have all of these eleven maxterms taken together to satisfy f(A, B, C, D) =
(A ∪ B)(C ∪ D)(Ac ∪ Dc). Now I will employ Gregg's method for disjunctive decomposition that I
described above on pages 10 -11.
_____________________________________________________________________________
Row   A B C D   (A ∪ B)   (C ∪ D)   (Ac ∪ Dc)   (A ∪ B)(C ∪ D)(Ac ∪ Dc)   Maxterm
 0    0 0 0 0      0         0          1                 0               (A ∪ B ∪ C ∪ D)
 1    0 0 0 1      0         1          1                 0               (A ∪ B ∪ C ∪ Dc)
 2    0 0 1 0      0         1          1                 0               (A ∪ B ∪ Cc ∪ D)
 3    0 0 1 1      0         1          1                 0               (A ∪ B ∪ Cc ∪ Dc)
 4    0 1 0 0      1         0          1                 0               (A ∪ Bc ∪ C ∪ D)
 5    0 1 0 1      1         1          1                 1               ----------
 6    0 1 1 0      1         1          1                 1               ----------
 7    0 1 1 1      1         1          1                 1               ----------
 8    1 0 0 0      1         0          1                 0               (Ac ∪ B ∪ C ∪ D)
 9    1 0 0 1      1         1          0                 0               (Ac ∪ B ∪ C ∪ Dc)
10    1 0 1 0      1         1          1                 1               ----------
11    1 0 1 1      1         1          0                 0               (Ac ∪ B ∪ Cc ∪ Dc)
12    1 1 0 0      1         0          1                 0               (Ac ∪ Bc ∪ C ∪ D)
13    1 1 0 1      1         1          0                 0               (Ac ∪ Bc ∪ C ∪ Dc)
14    1 1 1 0      1         1          1                 1               ----------
15    1 1 1 1      1         1          0                 0               (Ac ∪ Bc ∪ Cc ∪ Dc)
_____________________________________________________________________________

It appears that I have performed my de Morgan operations appropriately since my lists of
maxterms generated by the de Morgan and the Gregg methods agree. The table above helps me
to explain a major distinction between Kauffman’s original Boolean function and my maxterm
expansion of it. This distinction involves the number of variables in the clauses of our functions. In
Kauffman’s formulation, all clauses have just two binary variables. But in both my minterm and
maxterm expansions, all clauses have all four binary variables. My method requires that this be
so. Indeed, the definitions of a minterm [page 5] and a maxterm [page 10] require that both of
these terms contain states of all Boolean variables in the function decomposition mentioned
above. This will be an important point to keep in mind as we proceed. My minterms and
maxterms, as clauses, will always have a number of variables in each clause [the quantity labeled
K below] that is equal to the number of variables [V].
Now it's time to compare the minterm and maxterm expansions of Kauffman's Boolean
function to see what they reveal. First, only the minterm expansion reveals the specific bit strings
that satisfy this Boolean function. You can see this in the table above. Only the bit strings in Rows
5, 6, 7, 10, and 14 correspond to the truth of, or satisfy, Kauffman’s f(A, B, C, D) =
(A ∪ B)(C ∪ D)(Ac ∪ Dc). This is the same thing as saying that the only conjunctive minterms
that satisfy this function are M5, M6, M7, M10, and M14. Far less informative is the equivalent
statement that this Boolean function is also satisfied by the intersection of all the eleven
disjunctive maxterms that are derived from the remaining eleven bit strings. Thus, to satisfy
Kauffman’s Boolean function we must have one or the other of the five minterms I have identified.
Or, equivalently, we must have all of the maxterms I have identified. The reason, as you see in
the table above, is that no one of the individual bit strings in Rows 0, 1, 2, 3, 4, 8, 9, 11, 12, 13,
and 15 corresponds with the truth of, or satisfies, f(A, B, C, D) = (A ∪ B)(C ∪ D)(Ac ∪ Dc).
Observe in the last column of the table above that a maxterm associated with any of these eleven
rows has a different complementation pattern than does the bit string with which it is associated.
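The same enumeration also yields the maxterm expansion mechanically: every falsifying row contributes one maxterm whose complementation pattern is the opposite of its bit string, as just noted. Another sketch of my own:

from itertools import product

def f(a, b, c, d):
    return (a or b) and (c or d) and ((not a) or (not d))

names = ["A", "B", "C", "D"]
maxterms = []
for bits in product([0, 1], repeat=4):
    if not f(*bits):
        # Bit 0 -> plain literal, bit 1 -> complemented literal:
        # the flipped complementation pattern of the falsifying row.
        literals = [n if v == 0 else n + "c" for n, v in zip(names, bits)]
        maxterms.append("(" + " or ".join(literals) + ")")

print(len(maxterms))  # -> 11
print(maxterms[0])    # -> (A or B or C or D), from Row 0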
3.3.3 Ksat and Phase Transitions
I can now apply what I have done so far to Kauffman’s very interesting discussion of
phase transitions that occur as far as the satisfiability of Boolean functions is concerned
[Kauffman, 2000, 192-194]. At first, I will restrict attention to my maxterm expansion of
Kauffman’s original Boolean function. To set the stage for my comments on Ksat problems, let’s
review the three essential ingredients of the Kauffman [hereafter Stuart] and Schum [hereafter
Dave] formulations of Stuart’s Boolean function.
                         Variables [V]    Clauses [C]    Variables In Each Clause [K]
Stuart:                        4                3                       2
Dave [Maxterms]:               4               11                       4
Stuart first draws upon the work of a physicist named Scott Kirkpatrick who has studied
Boolean functions and the extent to which they might be satisfied [unfortunately Stuart does not
supply a reference to Kirkpatrick’s work]. What Kirkpatrick discovered was the importance of the
ratio between the number of clauses [C] in a Boolean function, and the number of variables [V]
that this function involves. As C gets larger than V a point is eventually reached at which the
probability of satisfying the Boolean function drops suddenly and precipitously, corresponding to a
phase transition. What is remarkable about Kirkpatrick’s work is that his studies revealed that a
specific point on the C/V ratio could be determined where this phase transition will occur and that
this critical ratio of C/V = R depends only on K, the number of variables in each clause. He
showed that R = (ln 2)(2^K) = 0.6931(2^K). Here’s where the term Ksat comes from; this critical ratio R
depends only on K. I note here that Kirkpatrick’s formulation assumes that there will always be
the same number of variables in each of the disjunctive clauses in a Boolean function. In forming
any Boolean function there’s no requirement that all clauses have the same number of variables.
For example, we might be interested in the satisfiability of the Boolean function g(A, B, C, D) =
(A ∪ B)(B ∪ Cc ∪ D)(A ∪ Dc). In this case K is not constant over the three disjunctive clauses in
the Boolean function. Figure 7 shows the nature of this phase transition.
[Figure 7. Probability of satisfiability plotted against the ratio C/V: the probability stays near 1.0 for small C/V and then drops sharply toward 0 at the critical ratio C/V = (ln 2)(2^K).]
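The transition sketched in Figure 7 can also be observed empirically. The following Python experiment is my own construction, not Kirkpatrick's; I note as a caution that for small K the observed threshold sits below (ln 2)(2^K) [near C/V ≈ 4.3 when K = 3, by well-known empirical estimates], so the formula is best read as an upper bound on where the drop occurs. The sketch generates random K-clause Boolean functions and estimates the probability that they can be satisfied:

import random
from itertools import product

def random_clause(v, k):
    """One random disjunctive clause: k distinct variables, each possibly negated."""
    variables = random.sample(range(v), k)
    return [(x, random.choice([False, True])) for x in variables]  # (index, negated?)

def satisfiable(clauses, v):
    """Brute-force satisfiability check over all 2^v assignments."""
    for assignment in product([False, True], repeat=v):
        if all(any(assignment[x] != negated for x, negated in clause)
               for clause in clauses):
            return True
    return False

V, K, TRIALS = 8, 3, 100
for C in range(8, 57, 8):
    sat = sum(satisfiable([random_clause(V, K) for _ in range(C)], V)
              for _ in range(TRIALS))
    print(f"C/V = {C / V:.1f}   estimated P(satisfiable) = {sat / TRIALS:.2f}")
# With so few variables the transition is smeared out, but the estimated
# probability should still drop sharply as C/V grows past the critical region.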
I believe that this very interesting result can give a formal expression to at least one
interpretation of Occam’s Razor. The number of clauses in a Boolean function basically identifies
the number of constraints imposed in satisfying this function. If the number of these constraints
gets to be many times larger than the number of variables there are, the chances of finding a
result that satisfies this function decrease, and precipitously so; this is what’s so interesting
about Kirkpatrick’s result. Let me return for a moment to hypotheses expressed as Boolean
functions. What this phase transition says is that a point will be reached at which the specificity of
our hypothesis will suddenly outrun our ability to find any scenario that satisfies it. Here’s another
way, I believe, to interpret Occam’s Razor. Detailed hypotheses are often necessary in many
situations. However, when they become too detailed we may never be able to find combinations
of evidence that are consistent with these hypotheses.
I spent a bit of time exploring Stuart’s Boolean function and my maxterm representation
of it as they relate to this critical ratio R = (ln 2)(2^K). I thought this might be interesting since the
value of K is different in our two formally equivalent expressions; so is the ratio [C/V] in our two
expressions. For Stuart, K = 2; for me K = 4 = V. In any maxterm expansion K must equal V. In
Stuart’s expression the actual value of C/V = 3/4; in my maxterm expression, actual C/V = 11/4 =
2.75. I next calculated the critical value of R [for phase transition] in each of our expressions. For
Stuart, R = (ln 2)(2^2) = 2.772588722. For my maxterm expression, R = (ln 2)(2^4) = 11.09035489 [I
carry all these decimals here for a reason I’ll mention in just a minute]. Both of our actual C/V
ratios fall short of critical R required for a phase transition. I think the reasons differ, however,
which is another point I’ll address a bit later.
Next, using the critical value of ratio C/V = R = (ln 2)(2^K), and knowing that in both Stuart’s
and my formulation V = 4, I wondered how large Stuart’s C could be made before he encounters
a phase transition. In Stuart’s case, C = (4)R = (4)(2.772588722) = 11.09035489, which is exactly
the value of the critical ratio in my maxterm formulation of his Boolean function, since (4)(ln 2)(2^2) =
(ln 2)(2^4). In this case, our actual C/V ratios would be the same, namely 11/4. Calculation of C for
my maxterm expression produces sort of an anomaly, which I believe I can explain. For Dave, C
= 4R = (4)(11.09035489) = 44.36141956. This seems a preposterously large number of clauses I
could add to my maxterm expansion before I fall off the phase transition cliff. What it does say,
however, is that I stand virtually no chance of doing so. I believe the reason is that my minterm
and maxterm expansions simply illustrate all the ways in which Stuart’s original Boolean function
can be satisfied. My minterm expansion shows that there are just five specific ways in which
four-variable conjunctions will satisfy his function and my maxterm expansion shows that there are
exactly eleven four-variable disjunctions, all of which must be true to satisfy his function. I add
here that this conjunction of eleven disjunctions is immediately derivable from the five specific
conjunctive minterms.
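The arithmetic behind this comparison is easily checked [a trivial sketch of my own]:

from math import log

# R = (ln 2) * 2**K for Stuart's K = 2 and for the maxterm form's K = 4.
R_stuart = log(2) * 2**2
R_dave = log(2) * 2**4
print(R_stuart)      # 2.772588722239781
print(R_dave)        # 11.090354888959125
print(4 * R_stuart)  # 11.090354888959125 -- exactly equal to R_dave,
                     # since (4)(ln 2)(2^2) = (ln 2)(2^4)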
In closing, I add only that discussions of these very interesting Ksat problems seem
incomplete without discussion of the two major ways in which any Boolean function can be
expanded in terms of minterms and maxterms. Each of these expansions provides information
that lurks within Boolean functions but is not exposed until these expansions are performed.
Minterm and maxterm expansions give a complete account of the satisfiability of a Boolean
function. It is usually not easy to tell just by examining an original Boolean function whether it is
satisfied by any particular settings of its event ingredients. As Stuart mentions, some settings of
the ingredients of Boolean functions may lead to contradictions. Fortunately, these possibilities
are all eliminated in the procedure for generating a minterm expansion of a Boolean function [see
Step 3 on page 8]. The minterm and maxterm expansions I have provided for Stuart's Boolean
function are simply examples of how we can rather easily [at least for relatively simple functions]
determine the specific situations in which a Boolean function can be satisfied. The Ksat problem
seems very important and I do hope that discussion of it will continue.
4.0 A BRIEF SUMMARY
My belief is that the concept of a Boolean function is vital in so many studies of discovery
and invention. Thanks to those who have studied these functions, we have ways of determining
specifically how these functions may be satisfied. I have shown how Boolean functions, and
elements that arise in their decomposition, are useful in capturing attributes of the genetically
inspired evolutionary computation approach to search processes in engineering design. Equally
important are their applications in other areas in which many forms of evidence are employed to
generate and test hypotheses about events or phenomena in law, intelligence analysis, and other
important investigative areas. In addition, such functions arise naturally in abstract studies of how
complex situations involving hypothesis generation and evaluation might be profitably
investigated. I am certainly not the only person to observe the value of construing many different
problems in terms of Boolean functions. But I do hope that my present collection of ideas adds a
bit to the discussion of discovery and invention and that it will help generate your own further
thoughts about these very interesting and very complex intellectual processes.
REFERENCES
Arciszewski, T., Sauer, T., Schum, D. Conceptual Designing: Chaos-Based Approach. Journal
of Intelligent and Fuzzy Systems. Vol. 13, 2002/2003, 45-60
Birkhoff, G., MacLane, S. A Survey of Modern Algebra. Macmillan, NY, 1965
Borowski, E., Borwein, J. Harper Collins Dictionary of Mathematics. Harper Collins, NY, 1991
Coveney, P., Highfield, R. Frontiers of Complexity: The Search for Order in a Chaotic World.
Fawcett Columbine, NY, 1995
Dumitrescu, D., Lazzerini, B., Jain, L., Dumitrescu, A. Evolutionary Computation. CRC Press,
Boca Raton, Florida, 2000
Gregg, J. Ones and Zeros: Understanding Boolean Algebra, Digital Circuits, and the Logic of
Sets. IEEE Press, NY, 1998
Hintikka, J. Sherlock Holmes Formalized. In: The Sign of Three: Dupin, Holmes, Peirce.
eds Eco, U, Sebeok, T. Indiana University Press, Bloomington IN. 1983. pp 170-178
Hintikka, J., The Concept of Induction in the Light of the Interrogative Approach to Inquiry.
In: Inference, Explanation, and Other Frustrations: Essays in the Philosophy of Science.
ed. Earman, J. University of California Press, Berkeley, 1992
Hintikka, J., Bachman, J. What If?: Toward Excellence in Reasoning. Mayfield, Mountain View,
CA, 1991
Hintikka, J., Hintikka, M. Sherlock Holmes Confronts Modern Logic: Toward a Theory of
Information-Seeking Through Questioning. In: The Sign of Three: Dupin, Holmes, Peirce.
eds Eco, U, Sebeok, T. Indiana University Press, Bloomington IN. 1983. pp 154-169
Holland, J. Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann
Arbor, MI, 1975.
Hunt, C. Agent-Based Evidence Marshaling: Agent-Based Creative Processes for Discovering
and Forming Emergent Scenarios and Hypotheses. Doctoral Dissertation,
George Mason University. 12 May, 2001
Kadane, J., Schum, D. A Probabilistic Analysis of the Sacco and Vanzetti Evidence. Wiley
& Sons, NY, 1996
Kauffman, S. The Origins of Order: Self-Organization and Selection in Evolution. Oxford
University Press, NY, 1993
Kauffman, S. At Home in the Universe. Oxford University Press, NY, 1995
Kauffman, S. Investigations. Oxford University Press, 2000
Keeney, R., Raiffa, H. Decisions with Multiple Objectives: Preferences and Value Tradeoffs.
Wiley & Sons, NY, 1976
Levi, I. The Enterprise of Knowledge: An Essay on Knowledge, Credal Probability, and Chance.
MIT Press, 1983
Levi, I. Decisions and Revisions: Philosophical Essays on Knowledge and Value. Cambridge
University Press. 1984
Levi, I. The Fixation of Belief and Its Undoing: Changing Beliefs through Inquiry. Cambridge
University Press. 1991
Pfeiffer, P. Sets, Events, and Switching. McGraw-Hill, NY, 1964
Pfeiffer, P. Concepts of Probability Theory. McGraw-Hill, NY, 1965
Pfeiffer, P. Probability for Applications. Springer-Verlag, NY, 1990
Pfeiffer, P., Schum, D. Introduction to Applied Probability. Academic Press, NY, 1973
Schum, D. Marshaling Thoughts and Evidence During Fact Investigation. South Texas Law
Review. Vol. 40, No 2, Summer, 1999, pp 401-454
Schum, D., Discovery, Invention, and Their Enhancement, First International Conference:
Innovation in Architecture, Engineering, and Construction. Loughborough University,
Great Britain. 15 January, 2001
Shackle, G. L. S. Decision, Order, and Time in Human Affairs. Cambridge University Press,
1968
Shafer, G. A Mathematical Theory of Evidence. Princeton University Press, 1976
Thorp, E. Elementary Probability. Wiley & Sons, NY, 1966
Von Neumann, J., Morgenstern, O. The Theory of Games and Economic Behavior.
Princeton University Press, 1946.
Wigmore, J. The Science of Judicial Proof: As Given by Logic, Psychology, and General
Experience and Illustrated in Judicial Trials. 3rd Ed. Little, Brown & Co. Boston,
1937.