1959 PROCEEDINGS OF THE WESTERN JOINT COMPUTER CONFERENCE 277

A New Approach to High-Speed Logic

W. D. ROWE†

INTRODUCTION

Most approaches to the construction of high-speed computers and logic systems in the past have been first, to go from serial to parallel circuitry, and then to use faster and faster components to increase the operational speed of the circuitry. High-speed components have reached the state where we are now talking about switching times in the order of the light-foot. That is, if we have a switching circuit with a switching time of one millimicrosecond, the theoretical limit in which we can switch this information from one point to another requires that the separation distance be less than one foot. This is the theoretical limit imposed by the velocity of light and shows that we are rapidly approaching the limits of switching speed.

It is therefore necessary that we look again at the organization and utilization of logic techniques to determine whether there are not other means of obtaining high-speed operation from logic circuitry, and thereby of bypassing the need for extremely fast components.

It has long been recognized¹ that one solution to this problem would be the use of canonical logic forms of either the minterm (sum of products) or maxterm (product of sums) form.² The practical application of this technique has long been hampered by the lack of a suitable electronic device capable of responding to the multiple input-output loadings required.

This paper discusses a transistor device, the "Modified NOR Circuit," which is capable of handling up to 25 inputs and 25 outputs with response times of about 80 mμsec. This device is applied to the design of a high-speed adder and a high-speed counter.

The design approach used here is termed "parallel-parallel" logic. This term arises from the fact that not only is the function constructed in parallel, but the logic is also constructed in parallel. A comparison of this logic with others is as follows:

Serial Logic              Function: Serial      Logic: Serial
Parallel Logic            Function: Parallel    Logic: Serial
Parallel-Parallel Logic   Function: Parallel    Logic: Parallel

† Westinghouse Electric Corp., Buffalo, N. Y.
¹ R. K. Richards, "Arithmetic Operations in Digital Computers," D. Van Nostrand Co., Inc., New York, N. Y.; 1955.
² M. Phister, Jr., "Logical Design of Digital Computers," John Wiley and Sons, Inc., New York, N. Y.; 1958.

FUNDAMENTALS OF PARALLEL-PARALLEL LOGIC

With a logic circuit that has an infinite number of inputs and outputs, and if all input signals and their complements are available, logic arrays can be constructed using only one level of logic circuits. The advantage of this lies in the fact that the complete operation time of a logic array consists of only one logic circuit propagation time. This means that the maximum speed of a logic array is the same as the maximum operating speed of a single logic component.

To explain this procedure, it can be shown that any logical operation can be written using conventional Boolean notation (where + indicates an OR operation, · an AND operation, and an overbar a complement) as a sum of products or a product of sums, as in Eq. (1). [Eq. (1) is not legible in the source.]

The left-hand expression in the equation may be represented by a number of AND circuits (first level) working into a single OR gate (second level) (Fig. 1), while the right-hand (equivalent) expression can be represented by a number of OR gates (first level) working into an AND gate (second level) (Fig. 2). It is then obvious that if each logic circuit has no limit on its fan-in and fan-out, all logic can be done on two levels, or a depth of two.

Now, the OR operation is electronically unique, and can often be performed by a simple junction of leads without any logic elements. In the case of the NOR logic element, to be discussed below, the OR operation need not be performed at all, since this element accepts multiple inputs and performs the necessary OR operation implicitly.
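As a rough illustration of the two canonical forms discussed above (it is not part of the original paper), the following Python sketch builds the sum-of-products and product-of-sums forms from a truth table and checks that each reproduces the function with a depth of two. All function names, and the majority-vote example, are hypothetical and chosen only for illustration.

```python
from itertools import product


def minterm_rows(f, n):
    """Input rows on which f is 1; each row becomes one first-level AND term."""
    return [bits for bits in product((0, 1), repeat=n) if f(*bits)]


def maxterm_rows(f, n):
    """Input rows on which f is 0; each row becomes one first-level OR term."""
    return [bits for bits in product((0, 1), repeat=n) if not f(*bits)]


def sum_of_products(rows, bits):
    # Level 1: an AND term is 1 only on its own row. Level 2: OR of all terms.
    return int(any(bits == row for row in rows))


def product_of_sums(rows, bits):
    # Level 1: an OR term is 0 only on its own row. Level 2: AND of all terms.
    return int(all(bits != row for row in rows))


def forms_agree(f, n):
    """True when both two-level forms reproduce f on every input row."""
    sop, pos = minterm_rows(f, n), maxterm_rows(f, n)
    return all(
        sum_of_products(sop, bits) == product_of_sums(pos, bits) == int(bool(f(*bits)))
        for bits in product((0, 1), repeat=n)
    )


def majority(a, b, c):
    """Hypothetical example function: 1 when at least two inputs are 1."""
    return a + b + c >= 2
```

Calling `forms_agree(majority, 3)` exercises both forms on all eight input rows. The number of first-level terms is simply the number of rows returned by `minterm_rows` or `maxterm_rows`, which is what drives the fan-in needed at the second level.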
If all signal complements are not available, a third level is required for negation. Since most signals come from bistable registers in parallel-parallel operation, both the signal and its complement are almost always available. The insertion of a third level in some inputs and not others can cause certain race conditions if care is not exercised. Memory and counting circuits also require special consideration.

Previous papers have discussed the use of a particular universal logic circuit called a "NOR" circuit.³,⁴ The logic of this circuit is such that an output exists if, and only if, neither input A, nor B, nor C, etc., is present. (See Fig. 3.) This circuit is universal in that it can perform all logic functions when combined in various forms with other NOR circuits. The right-hand expression of (1) can be expressed simply by operation of several NOR circuits (first level) into another NOR circuit (second level). (See Fig. 4.) (Where the output of the array is fed to other similar circuits the second level may be omitted, as remarked above.)

Fig. 1. Representation of two-level logic, AND-NOR.
Fig. 2. Representation of two-level logic, OR-AND. Y = (X1 + X2)(X1 + X2) [complement bars not legible in the source].
Fig. 3. NOR logic; (a) diagrammatic NOR; (b) truth table. Logical expression: an output appears at C if neither input A nor input B is present.
Fig. 4. Representation of two-level logic, NOR-NOR. Y = (X1 + X2)(X1 + X2) [complement bars not legible in the source].

³ W. D. Rowe, "The transistor NOR circuit," 1957 WESCON CONVENTION RECORD, pt. 4, pp. 231-245.
⁴ W. D. Rowe and T. A. Jeeves, "The NORDIC II computer," 1957 WESCON CONVENTION RECORD, pt. 4, pp. 85-95.

From the collection of the Computer History Museum (www.computerhistory.org)

The particular advantage of using the NOR circuit instead of English logic circuits is twofold.
First, only one type of circuit is required, so that only a single propagation constant is necessary; and second, it is possible to build NOR circuits with numbers of inputs and outputs that are quite large, in the order of 25 inputs and 25 outputs. This number is sufficiently large for many applications.

APPLICATION OF PARALLEL-PARALLEL TECHNIQUE TO A FULL ADDER

The examples used so far are trivial in that they are two levels for both the parallel and the parallel-parallel cases. In order to illustrate more fully the advantages of the parallel-parallel method, a slightly more sophisticated example will be examined. This case is that of a full adder circuit. A basic full adder (Fig. 5) provides the sum of two addend variables and the carry variable from the preceding stage in binary form. The circuit is built up of two half-adders utilizing a maximum of two inputs per logic circuit. The depth is five, so that five propagation times are required. The logic expression for this circuit is shown to be

$$ s = [x\bar{y} + \bar{x}y + c]\,[\overline{x\bar{y} + \bar{x}y} + \bar{c}], \tag{2} $$

which is derived directly by the substitution of the output of one half-adder, which can be expressed in two equivalent ways,

$$ H = x\bar{y} + \bar{x}y = (x + y)(\bar{x} + \bar{y}), \tag{3} $$

into the input of the second half-adder, expressed as

$$ s = (H + c)(\bar{H} + \bar{c}), \tag{4} $$

to form (2). Here $x$ and $y$ are the two addend variables and $c$ the preceding carry variable.

Manipulation of this equation brings it into parallel-parallel form so that the equivalent expression is

$$ s = (x + y + c)(\bar{x} + \bar{y} + c)(x + \bar{y} + \bar{c})(\bar{x} + y + \bar{c}). \tag{5} $$

The derivation of this equation from (2) is shown in Appendix I. This equation leads to the circuit of Fig. 6, where multi-input logic circuits are used. The depth of logic is now only two.
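The two-level full-adder sum can be checked with a small Python sketch. It is an illustration only, under two stated assumptions: that Eq. (5) reads s = (x + y + c)(x̄ + ȳ + c)(x + ȳ + c̄)(x̄ + y + c̄), and that complements of all inputs are available, as from bistable registers. The helper names `nor` and `full_adder_sum_nor_nor` are hypothetical.

```python
from itertools import product


def nor(*inputs):
    """Multi-input NOR: the output is 1 only when every input is 0."""
    return int(not any(inputs))


def full_adder_sum_nor_nor(x, y, c):
    """Sum bit of a full adder built from two levels of NOR circuits."""
    # Complements are assumed to be available directly (for example from registers).
    nx, ny, nc = 1 - x, 1 - y, 1 - c
    # First level: one NOR per OR factor of Eq. (5). Each NOR output is the
    # complement of its factor.
    first_level = [
        nor(x, y, c),
        nor(nx, ny, c),
        nor(x, ny, nc),
        nor(nx, y, nc),
    ]
    # Second level: the NOR of the complemented factors equals the AND of the factors.
    return nor(*first_level)


def matches_binary_sum():
    """Compare the NOR-NOR array with the arithmetic sum bit on all 8 input rows."""
    return all(
        full_adder_sum_nor_nor(x, y, c) == (x + y + c) % 2
        for x, y, c in product((0, 1), repeat=3)
    )
```

Every signal passes through exactly two NOR circuits in this sketch, which corresponds to the depth of two stated for Fig. 6. The first-level circuits each need three inputs and the second-level circuit needs four.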
Fig. 5. Full adder consisting of two half-adders, using only two inputs per NOR. Depth = 5. $[(x + y)(\bar{x} + \bar{y}) + c]\,[\overline{(x + y)(\bar{x} + \bar{y})} + \bar{c}] = s$.

Fig. 6. Full adder consisting of parallel-parallel logic. $s = (x + y + c)(\bar{x} + \bar{y} + c)(x + \bar{y} + \bar{c})(\bar{x} + y + \bar{c})$.

In the example, fewer logic circuits are required in the parallel-parallel case, since most of the logic is accomplished in the inputs of the logic circuits and the interwiring. This does not always hold true. The worst case of parallel-parallel logic has a maximum of $2^n - 1$ first-level logic circuits, where $n$ is the number of inputs to each first-level logic circuit. There are $q$ second-level logic circuits, where $q$ is the number of desired outputs. Fortunately, most applications are so specialized that only a few first-level logic circuits are required. However, in the worst case, the second-level logic circuits must handle $2^n - 1$ inputs. It is possible to reduce the number of logic circuits by compromising on depth, as was done by Weinberger and Smith.⁵

⁵ A. Weinberger and J. L. Smith, "A one-microsecond adder using one-megacycle circuitry," IRE TRANS. ON ELECTRONIC COMPUTERS, vol. EC-5, pp. 65-73; June, 1956.

THE MODIFIED TRANSISTOR NOR CIRCUIT

In order to facilitate the use of parallel logic it is necessary to have a circuit with as large a number of inputs (fan-in) and outputs (fan-out) as possible. Since it is extremely desirable to use only a single logic circuit, the transistor NOR circuit was selected for investigation.

The basic transistor NOR circuit is shown in Fig. 7. A negative voltage on any of the inputs, M, is sufficient to cause the transistor to saturate and supply a ground potential signal to the outputs, N. The absence of a negative voltage on the inputs causes the positive bias voltage to maintain the transistor at cut-off. Under this condition, the transistor being in a very high impedance state, the outputs see a negative voltage equal to

$$ \text{output voltage} = V_{CC}\,\frac{R_r}{xR_C + R_r}, $$

where $R_r$ is the input resistor value of a NOR circuit being driven by one of the NOR circuit outputs, N; $R_C$ is the collector load resistor; and $x$ is the number of outputs, N, actually connected to the inputs of other NOR circuits. Only one NOR circuit input is driven by one NOR circuit output. This equation shows that the output voltage is reduced as more inputs of succeeding circuits are connected to the outputs of the NOR circuit in question. If the minimum voltage that appears under the fully loaded condition is called a "one" (this voltage being sufficient to saturate a succeeding NOR), and the absence of this voltage (ground) is called a "zero," the logic conditions of Fig. 3 are fulfilled and this is the basic NOR operation.

Since all inputs are resistive, and most of the logic is accomplished in the input network, only a small number of transistors are required as compared to the number of resistors. This is certainly desirable, as most of the logic is now accomplished by the wiring interconnections and resistors, which can be inexpensive, reliable, and miniaturized elements.

Fig. 7. Basic transistor NOR circuit. (Labels: positive bias supply $V_{BB}$; ground G; negative voltage supply $V_{CC}$; inputs M; outputs N.)
Fig. 8. The modified transistor NOR circuit, type A. (Labels: +24 volts; ground; 24-volt Zener diode; $V_{CC}$, -250 volts; inputs and outputs numbered 1 to 25.)

For extreme reliability, the basic NOR circuit seems to have a maximum fan-in, fan-out of six, which is fairly small when one is considering parallel-parallel logic.
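The loading behaviour of the basic NOR circuit can be made concrete with a short Python sketch of the output-voltage relation, read here as output voltage = V_CC · R_r / (x·R_C + R_r). This is only an illustration: the function name and all component values below are hypothetical and are not taken from the paper.

```python
def loaded_output_voltage(v_cc, r_c, r_r, x):
    """Output voltage of a cut-off basic NOR stage that drives x NOR inputs.

    Models the collector load r_c in series with x input resistors r_r in
    parallel, which is the divider described in the text.
    """
    if x < 0:
        raise ValueError("x must be a non-negative number of driven inputs")
    return v_cc * r_r / (x * r_c + r_r)


# Hypothetical values, chosen only to show the trend: the "one" level shrinks
# in magnitude as more succeeding inputs are connected.
V_CC = -20.0      # volts (assumed)
R_C = 1_000.0     # ohms (assumed)
R_R = 10_000.0    # ohms (assumed)

levels = [loaded_output_voltage(V_CC, R_C, R_R, x) for x in range(0, 7)]
```

With these assumed values the unloaded output sits at the full supply voltage and the magnitude falls steadily as `x` grows, which is the effect that limits the fan-out of the unmodified circuit.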
In order to increase the fan-in and fan-out to about 25, a more suitable number, the basic NOR circuit has been modified so that the output voltage remains at essentially power-supply potential regardless of output loading. This allows the circuit to drive many more outputs than is possible in the basic circuit.

One means of modifying the NOR circuit is shown in Fig. 8. A 24-volt breakdown Zener diode is placed across the collector and emitter of the transistor such that the output will never fall below 24 volts when the transistor is cut off (output load is restricted to never exceed a condition where the output voltage will fall below this value). Under this condition the current derived from the high voltage $V_{CC}$ acting through the resistor is shunted between the load and the Zener diode. As the load varies (due to changes in the number of outputs used) the excess current is shunted through the diode so that the load is continually driven from a 24-volt output signal during a "one" output condition. It is certainly possible to replace the diode with a resistor whose value is chosen such that the output voltage is always in the fully loaded condition. This reduces the circuit flexibility, since a different resistor value is required for every different condition of output loading.

A basic disadvantage of this circuit is its high power dissipation, arising from the high-voltage power supply acting across resistor $R_C$. This disadvantage has been eliminated by the design of a second type of modified NOR circuit. This circuit is based on the fact that under certain conditions a transistor makes an excellent constant voltage source. This operation is described in Fig. 9. A constant emitter current source is derived from the constant voltage difference between bias voltages ($V_{EE}$ and $V_{CE}$) acting across emitter resistor $R_E$. This constrains the maximum collector current to be

$$ \frac{|V_{EE}| - |V_{CE}|}{R_E}\,A = I_{c\,max}, $$

where $|V_{EE}| > |V_{CE}|$ and $A$ is the current gain of the transistor. Below this value a range of constant potential exists. An actual plot of this can be made by observing the meters in the circuit of Fig. 9(a) when the load resistor $R_L$ is varied. This plot is shown in Fig. 9(b) for a particular case where $I_{c\,max}$ = 20 ma and $V_{CE}$ = -24 volts. The constant voltage portion is shown, and this is the operating range utilized.

Fig. 9. (a) Constant voltage circuit; (b) circuit operation plot, with the horizontal axis in volts and the regions labeled "ordinary load line" and "constant voltage portion."

This circuit is then combined with the NOR circuit to make a very effective modified NOR circuit. (See Fig. 10.) Transistor $T_1$ is the basic logic switch, and transistor $T_2$ is the constant voltage source. The power dissipation of this circuit is considerably less than that of the Type A circuit. The silicon diode placed in the forward direction between base and ground prevents the base from being overloaded when a plurality of inputs have "one" signals applied. This connection makes use of the high forward-voltage drop of most silicon diodes.

This circuit has actually been constructed using micro-alloy diffused-base transistors. The circuit has a simultaneous fan-in and fan-out of 25 with an average propagation time in the order of 80 mμsec.

In actually testing circuits in various applications, evidence points out that the average propagation time is a more significant measurement of operation speed than rise, fall, and storage times considered individually. Average propagation time is measured by taking a series string of $n$ logic circuits and applying a pulse to the first stage. The average propagation time is the time delay for the pulse to be propagated from the first to the last stage divided by $n$, the number of logic circuits. In any particular NOR circuit the actual propagation time varies only a few per cent from the average, in the majority of cases. Since rise, fall, and delay times do not add directly to give the actual operating time of compounded circuits, average propagation time seems to be a more significant means of circuit operation time measurement.

Fig. 10. Modified transistor NOR circuit, type B.
Fig. 11. Three-stage high-speed carry circuit.

APPLICATION OF PARALLEL-PARALLEL LOGIC TO HIGH-SPEED ADDITION

In order to illustrate the effectiveness of parallel-parallel logic, an application consisting of the circuitry for the carry operation of a high-speed adder will be shown in detail. One of the major difficulties in designing high-speed adding devices is the drawback of ripple-through carry.¹ When making a binary addition, the most significant bit is dependent on the carry signal from the preceding stage, which is in turn dependent on the carry signal of its preceding stage, etc. This means that the most significant bit is dependent on the condition of the least significant and every other bit in order.

It has been shown¹,⁵ that the expression for a carry signal from any particular stage $K$ of a binary adder may be given as

$$ C_K = A_K B_K + (A_K + B_K)\,C_{K-1}. \tag{7} $$

If the substitutions

$$ D_K = A_K \cdot B_K, \qquad R_K = A_K + B_K \tag{8} $$

are used for convenience, the carry signals for a binary adder of $n$ bits may be expanded. Then

$$ C_0 = C_0. \tag{9} $$

Generally, since there is no carry into the least significant stage, $C_0$ is 0.

$$ C_1 = D_1 + R_1 C_0 \tag{10} $$

$$ C_2 = D_2 + R_2 D_1 + R_2 R_1 C_0 \tag{11} $$

$$ C_3 = D_3 + R_3 D_2 + R_3 R_2 D_1 + R_3 R_2 R_1 C_0 \tag{12} $$

$$ C_n = D_n + R_n D_{n-1} + R_n R_{n-1} D_{n-2} + \cdots + R_n R_{n-1} \cdots R_2 R_1 C_0. \tag{13} $$

This shows that the carry for any bit $n$ is immediately available from a single logic array for each bit. Furthermore, some of the logic expressions that are used in determining the carry of a particular bit are used in all the succeeding bits, so that much of the circuitry of each bit is necessarily repeated throughout.
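The expanded carry expressions can be compared with the stage-by-stage recurrence in a brief Python sketch. It assumes the carry expression has the standard form C_K = A_K·B_K + (A_K + B_K)·C_(K-1), with D_K = A_K·B_K and R_K = A_K + B_K; the function names are hypothetical, and bit lists are taken least significant stage first.

```python
from itertools import product


def ripple_carries(a, b, c0=0):
    """Carries from the recurrence C_k = A_k B_k + (A_k + B_k) C_(k-1)."""
    carries, c = [], c0
    for a_k, b_k in zip(a, b):
        c = (a_k & b_k) | ((a_k | b_k) & c)
        carries.append(c)
    return carries


def parallel_carries(a, b, c0=0):
    """Each carry written directly as one sum of products of D, R, and C_0."""
    d = [a_k & b_k for a_k, b_k in zip(a, b)]   # D_k = A_k . B_k
    r = [a_k | b_k for a_k, b_k in zip(a, b)]   # R_k = A_k + B_k
    carries = []
    for k in range(1, len(a) + 1):
        terms = []
        for j in range(k, 0, -1):
            # Term R_k R_(k-1) ... R_(j+1) D_j, which is just D_k when j == k.
            terms.append(d[j - 1] & int(all(r[i - 1] for i in range(j + 1, k + 1))))
        # Final term R_k R_(k-1) ... R_1 C_0.
        terms.append(c0 & int(all(r[i - 1] for i in range(1, k + 1))))
        carries.append(int(any(terms)))
    return carries


def agree_for_width(n):
    """Check both methods on every pair of n-bit addends and both values of C_0."""
    return all(
        ripple_carries(a, b, c0) == parallel_carries(a, b, c0)
        for a in product((0, 1), repeat=n)
        for b in product((0, 1), repeat=n)
        for c0 in (0, 1)
    )
```

In `parallel_carries` no carry depends on another carry, only on the D, R, and C_0 inputs, so each stage has k + 1 product terms feeding one summing element. That is the property that lets every carry settle after two levels of logic regardless of bit position.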
In the carry expression (7), $C_K$ is the carry signal from stage $K$, $C_{K-1}$ is the carry signal from the preceding stage, and $A_K$ and $B_K$ are the addends associated with bit $K$.

Actual circuitry for a three-bit carry circuit is shown in Fig. 11. The carry for each bit is arrived at after only two levels of logic circuitry, regardless of the bit position. Also, the adder of Fig. 6 may be used as the adder circuit, with the negation of the carry derived directly from similar circuitry. Total addition time is then exactly four propagation times, or approximately 320 mμsec, regardless of the number of bits in the adder.

For the basic carry circuit the number of logic circuits ($L$) is

$$ L = \sum_{x=1}^{n} (x + 2) = \frac{n(n + 5)}{2} = \tfrac{1}{2}n^2 + \tfrac{5}{2}n, \tag{14} $$

where $n$ is the number of bits. The maximum number of inputs required by any logic circuit is $n + 1$, and the maximum number of outputs is $\max_k\, k(n - k + 1)$, which is $n(n + 2)/4$ if $n$ is even, and $(n + 1)^2/4$ if $n$ is odd, where $k$ is bounded by $1 \le k \le n$. Then for the addition of two 20-bit words (without sign) a logic circuit with a fan-in of 21 and a fan-out of 110 is required. (The fan-out can be kept below 25 by using a logical design other than that given in Fig. 11. See Appendix II.) The carry circuit will require 250 NOR circuits. This compares with 80 logic circuits (fan-in, fan-out of 3) for a particular twenty-bit ripple carry circuit in use in the NORDIC II computer (4 circuits per bit).⁴

Fig. 12. Reversible binary counter.

Eq. (14) shows that by compromising on depth and thereby speed, the 20-bit carry circuit can be broken down into two 10-bit carry circuits, each requiring 75 logic circuits with a fan-in of 11 and a fan-out of 30. However, the depth is now four, half the speed of the 20-bit carry case.
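The circuit counts and loading figures quoted for the 20-bit and 10-bit carry circuits follow directly from Eq. (14) and the fan-in and fan-out bounds, as the Python sketch below shows. The function names are hypothetical; the formulas are the ones stated in the text.

```python
def carry_circuit_count(n):
    """Number of logic circuits L for an n-bit carry circuit, as the sum in Eq. (14)."""
    return sum(x + 2 for x in range(1, n + 1))


def max_fan_in(n):
    """Largest number of inputs needed by any logic circuit: n + 1."""
    return n + 1


def max_fan_out(n):
    """Largest number of outputs needed: the maximum of k(n - k + 1) for 1 <= k <= n."""
    return max(k * (n - k + 1) for k in range(1, n + 1))


def summary(n):
    """Return (circuits, fan-in, fan-out) for an n-bit carry circuit."""
    return carry_circuit_count(n), max_fan_in(n), max_fan_out(n)


print(summary(20))  # (250, 21, 110)
print(summary(10))  # (75, 11, 30)
```

The summation agrees with the closed form n(n + 5)/2, and the brute-force maximum agrees with n(n + 2)/4 for even n. Splitting the word into shorter carry groups therefore trades depth for a large reduction in both circuit count per group and required fan-out.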
For a depth of eight, a total of 100 logic circuits are required, since four 5-bit carry circuits may be used. Only 6 inputs and 9 outputs are required for each level. This demonstrates the flexibility that is available by compromising on speed. An example of what can be done with only a limited fan-in and fan-out is the SEAC high-speed adder.⁵

APPLICATION OF PARALLEL-PARALLEL LOGIC TO COUNTING CIRCUITS

Some circuits, such as counter chains, do not lend themselves directly to parallel-parallel logic. It is possible to treat such operation in a similar manner by having all counters and similar devices operated by parallel logical means, instead of direct sequential means. To illustrate this point, an ordinary binary counter has been examined to determine the logic involved in switching each stage by a count pulse. Then by considering each counter stage as a logical input, and determining from this the necessary changes in each counter stage for the next count pulse, we can effect a complete count cycle in only two propagation times, plus the switching time of a counter.

An example of such a binary counter is shown in Fig. 12 for 4 bits. It is reversible in that it can count up or down. The condition of the counter before a count pulse determines which counter stages should be changed when the count pulse appears. When all the stages have been changed, the new condition determines the change for the next succeeding count pulse, etc.

The limit of the size of a binary counter of this nature depends upon the maximum allowable fan-in, fan-out of the NOR circuits used. Thus for a 25-input-output NOR circuit, a 24-bit, simultaneous advance, binary counter is possible. It is also obvious that the counter code used is restricted only by the logic used. Therefore, a counter of any desired code is possible.
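The idea of deciding every stage's change from the state held before the count pulse can be sketched in Python as follows. The sketch does not reproduce the circuit of Fig. 12. It assumes the usual toggle rule for a binary counter (a stage changes when all lower stages are 1 for counting up, or all 0 for counting down), and the function names are hypothetical.

```python
def next_count(bits, up=True):
    """Return the counter state after one count pulse.

    bits[0] is the least significant stage. Every stage decides whether to
    change from the state held before the pulse, so all stages switch together.
    """
    enable = 1 if up else 0
    # Stage i changes when every lower stage holds `enable`; stage 0 always changes.
    change = [all(b == enable for b in bits[:i]) for i in range(len(bits))]
    return [b ^ int(t) for b, t in zip(bits, change)]


def to_int(bits):
    """Integer value of a state given least significant stage first."""
    return sum(b << i for i, b in enumerate(bits))


def run(bits, pulses, up=True):
    """Apply several count pulses and return the list of integer values seen."""
    values = []
    for _ in range(pulses):
        bits = next_count(bits, up)
        values.append(to_int(bits))
    return values
```

Because the change condition for the highest stage looks at every lower stage at once, its gate needs one input per lower stage plus the count and direction signals. This is why the attainable counter length is tied to the fan-in and fan-out of the logic circuit rather than to a chain of stage delays.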
RACING AND TIMING CONDITIONS

When every input signal is required to travel in paths of equal length to the output (i.e., all inputs extend through the same depth), the only conditions of racing that can occur are due to the variation of the propagation time of any logic circuit from the average propagation time. This condition is easily bypassed, since parallel-parallel operation is exactly synchronous. This means that all output signals appear simultaneously after a predetermined time delay when input conditions are changed. This time delay (taken for the deepest depth used) sets the minimum time between operations. False conditions due to racing occur only within this time period and disappear at its termination. All racing of this type is easily suppressed by gating the outputs so they do not read out during the circuit propagation time.

CONCLUSIONS

Parallel-parallel logic operation allows circuit operation at very high speed without the use of much higher speed components. The sacrifice for obtaining this operation is the number of logic circuits required and the topological and geometrical problems that arise from the complications of interwiring. Fortunately, these problems are not unduly severe for most actual applications.

The use of the modified NOR circuit allows operation with only one type of logic circuit with a large fan-in and fan-out. Furthermore, the logic is accomplished in the resistor gating and interwiring most of the time. This means that the actual number of transistors used is small compared to the amount of logic accomplished.

The attempt in this paper has been to outline briefly some of the possibilities that lie in this type of logic, along with its limitations. Although considerable work has been done in this area, it is too lengthy and detailed for inclusion here. Table I is a summary and comparison of the different types of logic and their applications, and simply expresses the basic ideas involved.
TABLE I

Serial                           Parallel                     Parallel-Parallel
Logic and function by sequence   Function is parallel;        Function is parallel;
                                 logic is sequence            logic is parallel
Slow                             Fast                         Very fast
Minimum equipment redundancy     High equipment redundancy    High equipment redundancy

APPENDIX I

LOGIC MANIPULATION OF THE FULL ADDER

$$ \text{Half sum} = x\bar{y} + \bar{x}y = H. \tag{15} $$

$$ \text{Full sum} = H\bar{C} + \bar{H}C = S, \tag{16} $$

where $C$ is the carry from the preceding stage.

$$ H = (x + y)(\bar{x} + \bar{y}) = x\bar{x} + x\bar{y} + y\bar{y} + \bar{x}y = x\bar{y} + \bar{x}y \tag{17} $$

since $x\bar{x} = y\bar{y} = 0$. By similar methods,

$$ S = (H + C)(\bar{H} + \bar{C}). \tag{18} $$

By substituting (15) into (18),

$$ s = [x\bar{y} + \bar{x}y + c]\,[\overline{x\bar{y} + \bar{x}y} + \bar{c}]. \tag{19} $$

Since $\overline{A + B} = \bar{A} \cdot \bar{B}$,

$$ s = [x\bar{y} + \bar{x}y + c]\,[(\overline{x\bar{y}})(\overline{\bar{x}y}) + \bar{c}]. \tag{20} $$

Since $\overline{A \cdot B} = \bar{A} + \bar{B}$,

$$ s = [x\bar{y} + \bar{x}y + c]\,[(\bar{x} + y)(x + \bar{y}) + \bar{c}], \tag{21} $$

$$ s = [x\bar{y} + \bar{x}y + c]\,[xy + \bar{x}\bar{y} + \bar{c}], \tag{22} $$

which gives

$$ s = x\bar{y}\bar{c} + \bar{x}y\bar{c} + xyc + \bar{x}\bar{y}c, \tag{23} $$

since $x\bar{x} = y\bar{y} = 0$. This is in AND-OR form. To get the more desirable (for NOR logic) OR-AND form, each product is written as the complement of a sum, since $AB = \overline{\bar{A} + \bar{B}}$:

$$ s = \overline{(\bar{x} + y + c)} + \overline{(x + \bar{y} + c)} + \overline{(\bar{x} + \bar{y} + \bar{c})} + \overline{(x + y + \bar{c})}. \tag{24} $$

Taking the complement of both sides,

$$ \bar{s} = (\bar{x} + y + c)(x + \bar{y} + c)(\bar{x} + \bar{y} + \bar{c})(x + y + \bar{c}), \tag{25} $$

$$ \bar{s} = (\bar{x}\bar{y} + \bar{x}c + xy + yc + xc + \bar{y}c + c)(\bar{x}y + \bar{x}\bar{c} + x\bar{y} + \bar{y}\bar{c} + x\bar{c} + y\bar{c} + \bar{c}). \tag{26} $$

Since $A \cdot A = A$ and $A + A = A$ (and $A \cdot \bar{A} = 0$),

$$ \bar{s} = \bar{x}\bar{y}\bar{c} + xy\bar{c} + \bar{x}yc + x\bar{y}c; \tag{27} $$

since $\overline{A + B} = \bar{A} \cdot \bar{B}$,

$$ s = (\overline{\bar{x}\bar{y}\bar{c}})(\overline{xy\bar{c}})(\overline{\bar{x}yc})(\overline{x\bar{y}c}); \tag{28} $$

since $\overline{A \cdot B} = \bar{A} + \bar{B}$,

$$ s = (x + y + c)(\bar{x} + \bar{y} + c)(x + \bar{y} + \bar{c})(\bar{x} + y + \bar{c}). \tag{29} $$

Eq. (29) is the final form.

APPENDIX II

In order to constrain the number of outputs of a single circuit, for such a matrix as the simultaneous carry circuit, to the limit of $n + 1$, more circuits can be used to parallel the loading. This is indicated in Fig. 13, which shows the high-speed carry circuit for three bits with a maximum of $n + 1$ inputs and outputs on each logic circuit. The number of circuits required is now

$$ L = n(n + 2) = n^2 + 2n, $$

which is certainly much larger than the previous case. This shows that an unlimited fan-out is desirable. For limited cases, this may be compromised by using parallel logic circuits. In either circumstance, the depth of levels remains unchanged.

Fig. 13. Three-stage high-speed carry circuit, limited loading case.

ACKNOWLEDGMENT

The author is grateful for the efforts of Dr. T. A. Jeeves, who initially suggested this approach and who has contributed to much of its development.

Information Retrieval Study*

ROBERT COCHRAN†

* Based on work done for the U. S. Army Electronic Proving Ground, Ft. Huachuca, Ariz.
† Computer Applications Sec., Computer Dept., G.E. Co., Phoenix, Ariz.

As is well known in the computer field, the basic scheme of information retrieval is to search a given set of data and extract those records which satisfy a certain set of criteria. While many schemes of information retrieval are being used, new techniques are always being considered and tested so that this very important task of computer technology can be perfected relative to any given set of circumstances.

The impetus to develop more effective methods for recalling and correlating recorded information originated in the field of science and technology. It was evident more than 10 years ago that considerable improvements in methods for using recorded knowledge were long overdue. The search still goes on: to study, to invent, and to perfect.

The utility of information retrieval is self-evident. As more and more records and documents come into being, the problem multiplies in complexity. The need for swift recall is paramount. For example, in medicine, the