Logic Enumeration based Learning Machines

James Thomas Parker, MS
Post Graduate, Department of ECE, NCSU
Raleigh, NC
jtparker123@juno.com
910-798-3052
May 15, 2015

Abstract: Some Learning Machines in Artificial Intelligence can take a partially filled truth table and search for a similar truth table that matches the supplied entries, so that the outputs of the unknown entries can be filled in. This paper argues for a new concept called “circuit space”, which identifies truth tables that are close in proximity to the partial truth table based on the similarity of the logic circuits that can generate them. It is only an introduction to this area of research, based on a study using exhaustive logic enumeration.

Keywords: Logic Gate Enumeration, Complexity, Truth Tables, Randomness, Learning Machines, Cryptography

1 Introduction

It is generally believed that there are some similarities between an Artificial Intelligence (AI) Learning Machine and the neurons in a human brain [8]. We consider the case where a partial truth table is “taught” to an AI machine and the machine is asked to fill in the missing entries of this truth table. If a purely random algorithm were used to generate the complete truth table, the answers the AI machine supplied would be meaningless. For example, it is well known that we can always find a function that replicates a given sequence of integers, but the function may have no meaning when we try to extend the sequence it generates. If f(x1) = y1 and f(x2) = y2 and x1 != x2, we could always let f(x) = (x - x2)(y1/(x1 - x2)) + (x - x1)(y2/(x2 - x1)), but this function does not provide much insight into what is a reasonable value for f(x3). What is actually desired is a function that replicates the initial sequence of numbers and logically correlates with it, so that the extended sequence it generates is plausible. But how do we find a function that statistically correlates with the desired function?
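The two-point formula above can be checked numerically. The following is a minimal sketch in C; the function name interp2 and the sample points used in the note below are illustrative assumptions, not part of the original study. It shows that the formula reproduces both known points exactly, while its value at a third point is simply whatever the straight line through the two points dictates.

```c
#include <assert.h>

/* Two-point interpolating function from the introduction:
   f(x) = (x - x2)(y1/(x1 - x2)) + (x - x1)(y2/(x2 - x1)).
   The name interp2 is a hypothetical label chosen for this sketch. */
double interp2(double x, double x1, double y1, double x2, double y2)
{
    assert(x1 != x2);  /* the formula requires two distinct sample points */
    return (x - x2) * (y1 / (x1 - x2)) + (x - x1) * (y2 / (x2 - x1));
}
```

For example, with the assumed points (1, 5) and (3, 9), interp2 returns 5 at x = 1 and 9 at x = 3, and it returns 7 at x = 2. Nothing in the data says that 7 is a meaningful value there, which is the point the introduction makes.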
We will be looking for a function that is close in proximity to the desired function based on logical foundations.

In computer science, we often talk about state space. An example of this would be three variables, x, y, z, that could each take on the integer values 1 to 10. The space they would represent would have 3 dimensions and 1000 possible states. We often have an algorithm that searches the state space for a location where the 3 variables would optimize some scoring function f(x,y,z) such that this function reached a minimum or maximum value. Genetic algorithms use this approach extensively [15].

This paper proposes the use of what it calls “circuit space”. It is not completely clear at the writing of this paper what that space would look like, but it would have at least 4 primary dimensions where qualities of a truth table would locate it in that space. The object of a Learning Machine using circuit space would be to find truth tables that are close to each other in this space. The main objective of this paper is to make the reader aware that such a space exists and to show what some of its properties are so that it can be better defined.

We will study in this paper truth tables that output a single 1 or 0 in response to binary data input. They will represent a partial function because not all of the binary input values will be supplied with corresponding outputs. Correlations between candidate truth table functions that are paired with the truth table under study will be based on truth table metrics especially created by the author. The candidate truth tables complete the partial truth table and thus they will be referred to as “fulfilling tables”.

Complexity can be a useful metric and it is used sometimes in this paper. One may argue that there are many metrics that could be used to find relevant truth tables to a partially filled truth table, and they would be correct. The use of complexity is just one tool for Learning Machines developed by the author in this paper.
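The notion of a fulfilling table can be made concrete with a short sketch. The C code below is only an illustration under stated assumptions: a truth table is stored as an integer whose bit r is the output of row r, the partial table is given as a mask of known rows plus their values, and the name fulfilling_tables is hypothetical. It enumerates every complete table that agrees with the supplied entries.

```c
#include <assert.h>
#include <stddef.h>

/* Enumerate the fulfilling tables of a partial truth table.
   rows  : number of rows in the table (at most 16)
   known : bit r is set when the output of row r is supplied
   value : supplied outputs (only bits selected by known matter)
   out   : optional buffer for the fulfilling tables, may be NULL
   cap   : capacity of out
   Returns the number of fulfilling tables.  All names here are
   assumptions made for this sketch. */
unsigned long fulfilling_tables(int rows, unsigned known, unsigned value,
                                unsigned *out, unsigned long cap)
{
    assert(rows >= 1 && rows <= 16);
    unsigned long total = 1UL << rows;   /* number of complete tables */
    unsigned long count = 0;
    for (unsigned long t = 0; t < total; t++) {
        if (((unsigned)t & known) == (value & known)) {
            if (out != NULL && count < cap)
                out[count] = (unsigned)t;
            count++;
        }
    }
    return count;
}
```

With 8 rows of which 4 are known, the function reports 16 fulfilling tables, one for each way of filling the 4 unknown outputs. A Learning Machine of the kind discussed here would then need a metric to rank those candidates, since the enumeration alone treats them as equally valid.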
It is generally believed by the research community that randomness in a truth table leads to computational complexity. We will show that this is usually, but not always, the case when we examine logic enumerated tables. Because it is not always the case, we look for additional ways to measure the dimensions of what we call circuit space. In one approach, the number of gates in a logic circuit is increased incrementally, with exhaustive circuit enumeration at each step, until an electrical topology is found that generates the desired truth table. Pending further research, the number of logic gates in that circuit serves as a rough metric of the computational complexity of that truth table. In other words, a logic simulator that performed this circuit function would have to work harder if the circuit contained more gates, so the circuit is considered more complex. This information can be useful as a metric in determining which truth tables best complete or fulfill a partial truth table.

Over 20 years were spent studying the enumeration of electrical topologies. Although one may wonder why this curiosity was of such interest, it led to the discovery of how to enumerate all possible logic circuits of a particular type in a computation time that is greatly reduced relative to the unavoidable factorial explosion. That in turn led to newly created metrics based on what we call logic enumeration.

On a side note, the author defines pseudo-random encoding as a type of scrambling of information that is not truly random, but in which the computer algorithm that scrambles the data is very unpredictable to anyone except a person who knows the exact algorithm used. The term has been borrowed from the notion of pseudo-random number generators.
In what the author calls super-polynomial encryption, a very difficult problem, such as an NP-complete problem [2], must be solved in order to open the door to information that cannot easily be decoded except by the intended receiver of the message [12]. This paper could lead to the conclusion that complex pseudo-random encoded data can still reveal statistical relationships in the data, and this could give a talented person valuable insight into some way to compromise the encoding method.

The paper is intended to be a straightforward read for someone with a general background in computer science. Each concept builds on the ones before it throughout the paper. Given a partially filled truth table, we will examine in section 7 a creative way to search for fulfilling truth tables. This has some interesting implications for logic enumeration based Learning Machines and thus some potential applications.

2 Enumeration of Combinational Logic Circuits

It is necessary in this paper to enumerate the possible combinational logic (CL) circuits that will perform a given truth table function in order to measure its complexity. We restrict the enumeration to acyclic circuits (having no feedback and thus no internal registers) since those are the only type of circuits needed to implement a truth table. Initially, we restrict our focus to NAND gates only, since all other two-input logic gates can be implemented with five or fewer two-input NAND gates [11].

The bar diagram is a technique devised by the author to describe a logic circuit consisting of NAND gates. We use it to show how one could enumerate all possible circuits with i inputs. We will first illustrate the bar diagram using a specific example and a diagram. Consider the example shown in figure 1 of a CL circuit with 3 inputs and one output. In this example, there are five two-input gates, and all of the gate inputs and outputs must be used in an enumeration search for the simplest circuit.
Since there are no feedback loops, the gates are arranged in a hierarchy, with the bottom gate only able to connect to the circuit inputs and the top gate able to connect to the circuit inputs and the outputs of all the gates below it. The output of the top gate is the only output of the circuit. Figure 2 shows the bar diagram which corresponds to this circuit.

Figure 1. Example of a Combinational Logic Circuit with Three Inputs and One Output

Figure 2. Bar Diagram for CL Circuit of Figure 1

The horizontal bars, or simply bars, represent the sources of signals. The bars labeled with positive values are the outputs of each NAND gate used in the circuit, and the bars labeled with non-positive values are the circuit inputs. All of the gate inputs are represented by vertical lines. The bar that connects to a vertical line must have a label with a value less than that of the label of the vertical line. This prevents a logic gate from connecting to circuits that feed back to itself.

It took many years for the author to learn the secret of enumerating all possible circuits like those shown in figure 2 in a reasonable computation time. The secret we found was to start with n+1 bars, where n is the number of NAND gates in the circuit under test, and enumerate ALL possible circuits for those n+1 bars before trying n+2 bars. We continue enumerating all possible circuits for each increment in the number of bars until we have achieved the desired truth table or we cannot add any more bars without having unused bars. If we still have not achieved the desired truth table, then we add one more NAND gate to the circuit under test and start over. If we stick to this strategy, we will greatly reduce the factorial explosion that is inevitable.

2.1 Complexity Calculations

In this section, we establish the number of gates required to implement all possible truth tables (functions) with i inputs. First we determine the number of possible functions with i inputs.
Then we compute the minimum number of gates (g) required to implement all of the possible functions with i inputs using exhaustive logic enumeration as described at the beginning of section 2.

2.1.1 Number of Possible Functions

Note that for i inputs, there are 2^(2^i) possible functions. To show that there are 2^(2^i) functions for i inputs, consider the case where there are three inputs. Table I and Table II show two possible functions a CL circuit could implement with three inputs and one output. We see that there are 2^3 = 8 rows for each truth table and each row can have an output of 0 or 1. Thus, there are 2^8 = 2^(2^3) possible unique truth tables or functions. In general, for i inputs there are 2^(2^i) possible functions.

Table I. Function One
abc  Output
000  0
001  1
010  1
011  1
100  0
101  0
110  0
111  1

Table II. Function Two
abc  Output
000  1
001  0
010  1
011  1
100  0
101  0
110  1
111  0

2.1.2 Worst-case function

With enumeration we can now determine the complexity of the worst case function. The worst case function is the truth table given i inputs that requires the largest number of gates to be implemented when only NAND gates are used and they are all used efficiently. Efficient use implies that there does not exist a circuit with fewer NAND gates that can achieve the same function. It is not necessarily the case that the worst case function is unique. There may be many that require the maximum number of efficient gates to be implemented. We are not attempting to determine which functions are worst case, but rather the number of gates (complexity) needed to implement the worst case function.

A worst case function and its corresponding circuit for i = 3 are shown below. The function implemented by this circuit can be described as: “If any input is a 1 and the other two inputs are 0, the output will be 1. Any other input combination will result in an output of 0.” It requires ten NAND gates to implement.
If only two-input NAND gates are used, this function cannot be implemented with fewer gates. Thus, all gates are used efficiently.

abc  Output
000  0
001  1
010  1
011  0
100  1
101  0
110  0
111  0

Figure 3. Circuit with Unique Worst-Case truth table using Three Inputs and One Output that utilizes only two input NAND gates

3 Karnaugh Maps

A Karnaugh Map (or K-map) is a more concise representation of a truth table. The table is arranged so that neighboring cells have inputs with a Hamming Distance of one. That means that getting from any cell to any neighboring cell only requires that one input bit change its state. Observe Table IV for example. The cell at row 2 column 2 corresponds to input 0101, and the cell immediately below it corresponds to input 0111. K-maps make it easier to see patterns in the truth table. Seeing patterns will be useful later in determining the randomness of a given table.

Table III shows an example truth table with four inputs and one output. Table IV shows the corresponding K-map. The columns of the K-map are the four combinations for inputs a and b, and the rows are the four combinations of input for c and d. The outputs for the 16 rows of the truth table are shown within the table of the K-map.

Table III. Example Truth Table with 4 Inputs
abcd  Output
0000  0
0001  0
0010  1
0011  1
0100  1
0101  0
0110  1
0111  1
1000  1
1001  0
1010  0
1011  0
1100  1
1101  0
1110  0
1111  0

Table IV. Karnaugh Map of Table III
cd \ ab  00  01  11  10
00        0   1   1   1
01        0   0   0   0
11        1   1   0   0
10        1   1   0   0

4 Logic Enumeration Tables

There is a need to explore what features of a truth table contribute to its complexity. In an attempt to discover this, we generated all the possible circuits with four inputs using NOR and XOR gates in combination, as well as NAND and XNOR gates in combination, starting with the minimum number of gates, i.e. three. We simulated circuits with exhaustive logic enumeration until all 2^(2^4) truth tables were implemented.
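Simulating one candidate circuit during such an enumeration can be sketched as follows. This is a minimal illustration in C, not the simulator used for the study: the names gate_type, gate and circuit_table are hypothetical, sources are numbered from 0 (circuit inputs first, then gate outputs) rather than with the signed labels of the bar diagram, and the last gate is assumed to be the circuit output. The rule from section 2 that a gate may only draw from lower-numbered sources is enforced with assertions.

```c
#include <assert.h>

#define MAX_SOURCES 32

enum gate_type { GATE_NAND, GATE_NOR, GATE_XOR, GATE_XNOR };

struct gate {
    enum gate_type type;
    int a, b;              /* numbers of the two sources feeding this gate */
};

/* Output of a single two-input gate. */
static int gate_eval(enum gate_type type, int x, int y)
{
    switch (type) {
    case GATE_NAND: return !(x && y);
    case GATE_NOR:  return !(x || y);
    case GATE_XOR:  return x ^ y;
    default:        return !(x ^ y);   /* GATE_XNOR */
    }
}

/* Simulate an acyclic circuit on every input row and return its truth
   table as an integer whose bit r is the output for row r, where
   r = I0 + I1*2 + I2*2^2 + ...  Sources 0..ninputs-1 are the circuit
   inputs and source ninputs+k is the output of gates[k]. */
unsigned long circuit_table(int ninputs, int ngates, const struct gate gates[])
{
    assert(ninputs >= 1 && ninputs <= 4);
    assert(ngates >= 1 && ninputs + ngates <= MAX_SOURCES);
    unsigned long table = 0;
    int rows = 1 << ninputs;
    for (int r = 0; r < rows; r++) {
        int v[MAX_SOURCES];
        for (int b = 0; b < ninputs; b++)
            v[b] = (r >> b) & 1;
        for (int k = 0; k < ngates; k++) {
            int self = ninputs + k;
            /* no feedback: only lower-numbered sources are allowed */
            assert(gates[k].a >= 0 && gates[k].a < self);
            assert(gates[k].b >= 0 && gates[k].b < self);
            v[self] = gate_eval(gates[k].type, v[gates[k].a], v[gates[k].b]);
        }
        table |= (unsigned long)v[ninputs + ngates - 1] << r;
    }
    return table;
}
```

As a usage example, the familiar four-NAND construction of XOR on two inputs, written as the gate list {NAND,0,1}, {NAND,0,2}, {NAND,1,2}, {NAND,3,4}, yields the same table number as a single XOR gate. An exhaustive enumerator would call a routine like this for every legal wiring at a given gate count and record which table numbers have been reached.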
Using NOR/XOR gates or NAND/XNOR gates results in simpler circuits than using any other two-input, two-gate combination, so this greatly reduced computation time. Although we define the complexity of a truth table to be the minimum number of NAND gates needed to implement it, using NOR/XOR or NAND/XNOR gates provides us with a lower bound on the complexity of all 2^(2^4) = 65536 unique truth tables with 4 inputs.

Table V. K-map requiring 3 XOR/NOR gates
cd \ ab  00  01  11  10
00        1   0   1   1
01        0   0   0   0
11        0   0   0   0
10        0   0   1   1

Table VI. K-map requiring 8 XOR/NOR gates
cd \ ab  00  01  11  10
00        1   0   1   0
01        0   0   0   1
11        0   0   0   0
10        0   0   0   1

Tables V and VI show truth tables whose circuits were generated from the list of [1]. The NORXOR.txt file enumerates the minimum number of XOR-NOR gates needed to compute a given truth table where every circuit topology is considered. The first entry in each row of this file is the truth table number. To compute it, we first form an input index I0 + I1*2 + I2*2^2 + I3*2^3, where I0 is the lsb of the inputs for the truth tables and I3 is the msb input. That index takes on the values 0 to 15 for the rows of a given truth table with 4 bits of input. We then look at the single column outputs for a given truth table as a function of the input indices, which we will call IX. The indices take on the values IX0 to IX15, which correspond to the binary values of the 4 input bits. The specific truth table number is calculated as output(IX0) + output(IX1)*2 + output(IX2)*2^2 + … + output(IX15)*2^15. The truth table numbers thus take on the values 0 to 65535 in the same scheme as mentioned in section 2.1.1. The NANDXNOR.txt file is calculated in a similar manner.

Note that the K-map of table V appears fairly random but it is simple to implement with NOR/XOR gates. The same K-map requires 7 NAND/XNOR gates, but circuits are counted with their minimum value in these two files for their correlation in table 10.
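The truth table numbering just described can be sketched in a few lines of C. The function names table_number and table_outputs are assumptions made for illustration; they are not taken from the files of [1]. The array out holds output(IX0) through output(IX15).

```c
#include <assert.h>

/* Truth table number: output(IX0) + output(IX1)*2 + ... + output(IX15)*2^15,
   where out[ix] is the output for input index ix = I0 + I1*2 + I2*4 + I3*8. */
unsigned long table_number(const int out[16])
{
    unsigned long n = 0;
    for (int ix = 15; ix >= 0; ix--) {
        assert(out[ix] == 0 || out[ix] == 1);
        n = 2 * n + (unsigned long)out[ix];
    }
    return n;
}

/* Inverse mapping: recover the 16 outputs from a truth table number. */
void table_outputs(unsigned long n, int out[16])
{
    assert(n <= 65535UL);
    for (int ix = 0; ix < 16; ix++)
        out[ix] = (int)((n >> ix) & 1UL);
}
```

A table whose only 1 is at IX0 gets number 1, a table whose only 1 is at IX15 gets number 32768, and the all-ones table gets 65535, so every 4-input truth table maps to a distinct number in the range 0 to 65535.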
The K-map in table VI, which requires 8 gates, shows a pattern in that the top row is similar to the right hand column, and that runs counter to the belief that complex truth tables are random. The NAND/XNOR counterpart also requires 8 gates. Thus our subjective interpretation of what is a complex truth table differs notably from what logic enumeration can tell us.

5 Isomorphism of a Truth Table

We explain here what is meant by the isomorphism of a truth table [10]. Consider the following 3 input truth table:

abc
000 | 0
001 | 1
010 | 1
011 | 1
100 | 0
101 | 0
110 | 1
111 | 1

The 3 inputs to the circuit can be permuted 3! or 6 ways to yield 6 different isomorphisms. The outputs are rearranged accordingly. Suppose that we switch inputs “a” and “c”. We get:

cba
000 | 0
001 | 0
010 | 1
011 | 1
100 | 1
101 | 0
110 | 1
111 | 1

If we arrange the strings from lsb to msb, 01110011 translates into the isomorphism 00111011.

Since permuting the inputs of a circuit should not affect the minimum number of gates required, all isomorphisms of the NOR/XOR and NAND/XNOR tables were verified to be equivalent in their gate count. As an added check, all complements of the truth tables were verified to be within one gate of their counterpart, since at worst they would require the same circuit with just an inverter added. The inverter could be created with a NAND gate or a NOR gate that had its two inputs shorted. Finally, it was verified that the gate counts were equivalent when the meaning of 0 and 1 was interchanged in both the inputs and the outputs of all truth tables. In this case, it was also necessary to switch NAND gates with NOR gates and XNOR gates with XOR gates. Thus these two files were cross checked within themselves and with each other.

6 The Circuit Classification Algorithm

The algorithm below is really the most important result of this paper. It is presented in pidgin C to classify circuits that generate the outputs of a 4 input truth table.
That classification will be correlated in table 10 with the gate count based on logic enumeration. It took several years to discover this algorithm such that its circuit counts correlated in a plausible way with the logic enumeration circuit counts, and even then it was discovered by accident. It shows there is a mathematical relationship between truth tables and the simplest logic circuits that implement them. That information will be used to build a theory for the existence of circuit space.

    #include <stdio.h>

    //helper routines called below; their definitions are not listed in this
    //paper, so these prototypes are assumed from the way they are called
    int factorial(int n);
    void getisomorphism(int n, int kiso[], int kinp[]);

    int randomvalue(void)       //returns randomization value of input string from console
    {
        int kmap[65536];        //map the number of unique ways a rotation of bits occur
        int kinp[16],kiso[16];  //input string and its isomorphisms
        int i,j,k,n,r;          //work variables
        long ii;                //need long to index kmap
        int lowest;             //lowest class value among all isomorphisms of input string

        for (i=0;i<16;i++) scanf("%d",&(kinp[i]));  //scan in truth table string lsb first
        lowest=1000;                                //set to very high value
        for (n=0;n<factorial(4);n++)                //consider all 24 isomorphisms
        {
            for (ii=0;ii<65536;ii++) kmap[ii]=0;    //reset map of unique rotation mappings
            for (r=0;r<16;r++)                      //consider all 16 rotations of string
            {
                getisomorphism(n,kiso,kinp);        //array kiso is set to nth isomorphism of array kinp
                if (r>0)
                {
                    k=0;
                    for (i=0;k<16;i=(i+r)%17)       //rotate i mod a prime 16 times
                        if (i<16)
                        {
                            k++;                    //shuffle string 16 times
                            j=kiso[i]; kiso[i]=kiso[(i+1)%16]; kiso[(i+1)%16]=j;
                        }
                } // if r>0
                ii=0;
                for (j=15;j>=0;j--)                 //now convert string mapped into a long integer
                    ii=2*ii+kiso[j];
                kmap[ii]=1;                         //mark rotation-shuffle mapping
            } // for r loop
            j=0;
            for (ii=0;ii<65536;ii++) j+=kmap[ii];   //count number of unique rotations
            if (j<lowest) lowest=j;                 //take isomorphism with lowest class count
        } // for n loop
        return lowest;
    }
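The listing calls a helper, getisomorphism, whose definition is not given. The following is a minimal sketch in C of what such a helper could look like, offered only as an assumption: the names permute_inputs and get_isomorphism are hypothetical, and the order in which the 24 permutations are numbered is an arbitrary choice made here. Because the algorithm takes the lowest class value over all 24 isomorphisms, the numbering order should not affect its result.

```c
#include <assert.h>

/* Permute the inputs of a truth table.  in[r] is the output for row r,
   where bit b of r is the value of input b.  New input b takes the role
   of old input perm[b]. */
void permute_inputs(int nbits, const int perm[], const int in[], int out[])
{
    assert(nbits >= 1 && nbits <= 4);
    int rows = 1 << nbits;
    for (int r = 0; r < rows; r++) {
        int old = 0;                       /* matching row of the original table */
        for (int b = 0; b < nbits; b++)
            if ((r >> b) & 1)
                old |= 1 << perm[b];
        out[r] = in[old];
    }
}

/* Hypothetical stand-in for the helper named getisomorphism in the
   listing: decode n (0..23) into a permutation of the 4 inputs using
   the factorial number system, then apply it to kinp. */
void get_isomorphism(int n, int kiso[16], const int kinp[16])
{
    assert(n >= 0 && n < 24);
    int pool[4] = {0, 1, 2, 3};
    int perm[4];
    int left = 4;
    for (int b = 0; b < 4; b++) {
        int pick = n % left;               /* choose one of the remaining inputs */
        n /= left;
        perm[b] = pool[pick];
        for (int k = pick; k < left - 1; k++)
            pool[k] = pool[k + 1];
        left--;
    }
    permute_inputs(4, perm, kinp, kiso);
}
```

Applied to the 3 input example of section 5, with c taken as input 0 and a as input 2, the permutation {2, 1, 0} turns the string 01110011 into 00111011, which agrees with the isomorphism shown there.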
Here is the correlation table (circuit counts) for the 65536 truth tables, referred to elsewhere in this paper as table 10:

                     Minimum gates
class       3      4      5      6      7      8
00          0     79    489    354     20      0
04          0      0      8      8      0      0
05          0      1      1      0      2      0
06          0      0      6      6      0      0
07          0      0     31      9      0      0
08          0     10     20     22      4      0
09          0      6     40    154     44     12
10          0      0    141    394    175     12
11          0     54    278    695    785     84
12          0     51    549   1662   1590     92
13         12     91    937   4882   6459    681
14          0     38   1758  10499  17660   3185
15          3     21    565   3698   5955   1042
16          0      0      0     15    123     24

The entries of the table add up to 2^16 = 65536, which is the number of possible truth tables for a 16 bit string. Class 00 is a special case where the truth table had bilateral symmetry, and it is only included to make the total number of circuits add to the correct number. In a truth table with bilateral symmetry, one or more of the input bits has no impact on the output bit. The output bits will be the same pattern regardless of whether that input bit is a 0 or a 1. Thus that input bit is a “don’t care” and the assumption we used in logic enumeration that all input bits have some effect on the output bit is false.

7 An Analysis of the Circuit Classification Algorithm

There is a lot that can be learned from the circuit classification algorithm. It is a fundamental hypothesis of this paper that if a close circuit relationship can be found between a partial truth table and a fulfilling truth table, the fulfilling truth table has a better than random chance of being correct. An example of this hypothesis is demonstrated by figure 3. The complement of the truth table of figure 3 can be achieved simply by removing the last NAND gate in the circuit. Thus the exact complement of the table of figure 3 is in many ways closely related logically to the table itself. The minimum circuits of the two tables are also almost identical.
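The “don’t care” condition that defines class 00 can be tested directly on a truth table number. The sketch below is an illustration in C with hypothetical function names; it assumes the numbering of section 4, in which bit r of the table number is the output for input index r.

```c
#include <assert.h>

/* Returns 1 when input `bit` never affects the output of the 4-input
   truth table `table` (bit r of table is the output for input index r). */
int is_dont_care(unsigned long table, int bit)
{
    assert(bit >= 0 && bit < 4);
    for (int r = 0; r < 16; r++) {
        int partner = r ^ (1 << bit);      /* same row with that input flipped */
        if (((table >> r) & 1UL) != ((table >> partner) & 1UL))
            return 0;
    }
    return 1;
}

/* Returns 1 when at least one of the four inputs is a "don't care". */
int has_dont_care(unsigned long table)
{
    for (int bit = 0; bit < 4; bit++)
        if (is_dont_care(table, bit))
            return 1;
    return 0;
}
```

Counting the table numbers from 0 to 65535 for which has_dont_care returns 1 gives 942, which equals the sum of the class 00 row of the correlation table (79 + 489 + 354 + 20).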
The author is interested in defining circuit space. It has been mentioned that it could have 4 primary dimensions. One dimension would express how close two truth tables are when one of them undergoes a complement transformation such as the one mentioned in the previous paragraph. Another dimension would classify how close two tables are when undergoing an isomorphism transformation. Note that the minimum gate counts are unaffected by this type of transformation. Yet another dimension measures how close two tables are when only a small number of output bits are changed, and this is called by the author the Karnaugh space or Hamming distance dimension. Finally, the De Morgan dimension, as defined by the author, classifies similarities in gate counts between circuits that would be similar if the meaning of 0 and 1 were interchanged.

We have hypothesized that truth tables whose minimum circuits are close in circuit space have underlying logic similarities, such as the example mentioned with the table of figure 3. Understand that two circuits that have equivalent minimum circuit gate counts will not necessarily be logically similar, but circuits that are logically similar should have similar minimum gate counts. It is believed they will contain nodes that have identical Boolean expressions which are used to build the output expressions, and so they will have similar gate counts.

Further evidence to support this hypothesis is found in table 10 and the use of the circuit classification algorithm. We are using all 4 of the circuit space dimensions mentioned to classify truth tables, and the resulting classifications are shown to have a partial relationship to their corresponding truth table gate counts. The fact that this relationship is not completely random shows that some type of underlying mathematical basis is at work.
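Two of the dimensions described above have simple computational counterparts when a truth table is stored as a 16 bit number. The following C sketch is only an illustration under that assumption, with hypothetical function names; it is not a definition of circuit space. It measures the Hamming distance between two 4-input tables and forms the complement of a table.

```c
#include <assert.h>

/* Number of output rows in which two 4-input truth tables differ. */
int hamming_distance(unsigned long a, unsigned long b)
{
    assert(a <= 0xFFFFUL && b <= 0xFFFFUL);
    unsigned long diff = a ^ b;            /* 1 bits mark differing rows */
    int count = 0;
    while (diff != 0) {
        count += (int)(diff & 1UL);
        diff >>= 1;
    }
    return count;
}

/* Complement transformation: every output bit is inverted. */
unsigned long complement_table(unsigned long table)
{
    assert(table <= 0xFFFFUL);
    return ~table & 0xFFFFUL;
}
```

A table and its complement are as far apart as possible in Hamming distance (16 rows differ), yet section 5 notes that their gate counts are within one gate of each other. That contrast is one reason a single distance measure does not appear sufficient and several dimensions are proposed.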
This type of investigation is in the infant stages of research, but the existence of a circuit classification algorithm that can generate table 10, which is far from being completely random, is strong evidence of the existence of some type of circuit space. With a more sharply defined circuit space for truth tables, it should be clearer which fulfilling truth tables are a good fit for a partial truth table, given the assumption that there is some type of logical basis in their relationship. We build our science on the belief that the universe has a logical basis, so looking for logic in the things we observe only makes common sense.

8 Conclusion

In section 7, we have identified a metric that allows fulfilling truth tables to complete less than full tables in a statistical way. We propose measuring this metric in a newly defined circuit space. The metric’s plausibility is supported by exhaustive logic enumeration and the newly defined circuit classification algorithm. Further research will be needed to show whether this metric does indeed tag truth tables as having similar properties, but the way has been paved for a new technique in discovering truth table correlations.

A fundamental hypothesis of this paper is that truth tables that are close together in circuit space have a similar logical basis and should have similar logic responses to inputs. The ability to detect patterns in truth tables could possibly increase at a rapid pace if research improved and hardware was dedicated to that purpose. That could imply pseudo-randomness in encryption may not be completely secure. The occurrence of cyberhacking is discussed on the news frequently and it is thought to be one of our greatest threats to national security [7].

9 Acknowledgements

I would like to thank Dr. Alan Creaser, who served for many years as a professor of physics at the University of Hartford and gave me assistance in publishing this paper. He has been a long time friend. I would also like to thank Dr. Clayton Ferner, who is a professor in the department of computer science at UNCW and was very patient with me in the development of the early stages of this paper.

10 References

[1] Circuit Analysis Zipped Files. Found on Web at http://mathshowcase.com/papers/gates.zip on April 3, 2012
[2] Garey (1979) Computers and Intractability, A Guide to the Theory of NP-Completeness, pg 6
[7] Power Point Blog on White Noise Encoding. Found on Web at http://mathshowcase.com/papers/ on April 25, 2015
[8] Russell, Artificial Intelligence A Modern Approach, pg 567
[10] Hazewinkel, ed. (2001) “Isomorphism”, Encyclopedia of Mathematics, Springer, ISBN 978-1-55608-010-4
[11] www.allaboutcircuits.com/vol_4/chpt_3/9.html
[12] www.businessinsider.com/p-vs-np-millennium-prize-problem-2014-9
[15] David E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning