THE DESIGN AND CONSTRUCTION OF AN ASYNCHRONOUS PROGRAMMABLE CONTROL STRUCTURE

by

Donald N. North and James M. Guyer

Submitted in Partial Fulfillment of the Requirements for the Degrees of Bachelor of Science at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY

May, 1975

Signatures of the Authors: Department of Electrical Engineering and Computer Science, Submitted May 14, 1975

Certified by: Thesis Supervisor

Accepted by: Chairman, Departmental Committee on Theses


THE DESIGN AND CONSTRUCTION OF AN ASYNCHRONOUS PROGRAMMABLE CONTROL STRUCTURE

by Donald North and James Guyer

Submitted to the Department of Electrical Engineering and Computer Science on May 9, 1975 in partial fulfillment of the requirements for the degree of Bachelor of Science

ABSTRACT

This paper is concerned with the design and implementation of a practical asynchronous control structure capable of being easily programmed. Based upon an idea originally conceived by Professor Suhas S. Patil, the structure is a hardware system designed to simulate the concept of a Petri net and its internal flow of control. Such nets have found useful applications in the analysis of control flow in asynchronous systems; our net simulator, with its integrated interface circuitry to external functional subsystem building blocks, can provide the control structure for an arbitrary asynchronous system.

Two versions are described in this paper. The first system, currently under construction, is a prototype medium scale programmable matrix, designed to be directly compatible with the other asynchronous modules in use at the Computation Structures Group Lab, and thus easily interfaced to the outside world. Note that this entire functional system contains no clocks at all; its inherent speed is solely a function of the logic family used in its construction. Specific details are presented on the complete design, implementation, and use of the control structure. The second version is a paper study only of the feasibility of using a smaller programmable matrix ( such as the above ) plus some additional control circuitry to simulate a much larger matrix through time multiplexing. The advantages and disadvantages of this approach are explored.

THESIS SUPERVISOR: Professor Suhas S. Patil
TITLE: Assistant Professor of Electrical Engineering

ACKNOWLEDGEMENTS

We would like to thank Professor Patil for his ideas about a programmable logic array, which formed the basis for our project, and for his further help as work progressed, especially concerning the changes in the second version of our design.

TABLE OF CONTENTS

Title Page
Abstract
Acknowledgements
Table of Contents
List of Illustrations
Introduction
I.   The Petri Net
     A. The Petri Net Model
     B. Our Control Structure Model
     C. Typical Net Constructions
     D. Matrix Notation for Petri Nets
II.  The Programmable Array
     A. The Switching Matrix
     B. The Implementation
     C. The Register Matrix
III. The Multiplexed Array
     A. The Multiplexing Technique
     B. Implementing the Array
     C. The Memory Controller
Conclusion
References

LIST OF ILLUSTRATIONS

Number   Title
I-1      Representation of places and transitions
I-2      A sample Petri net
I-3      Places and tokens in a safe net
I-4      The steps in the firing of a transition
I-5      Representation for the input-output place
I-6      Filling/emptying of an input-output place
I-7      The decision places
I-8      Filling/emptying of a decision place
I-9      Creating tokens with a multiple output transition
I-10     Merging tokens with a multiple input transition
I-11     Selecting paths through a multiple output place
I-12     An arbiter
I-13     Merging structure corresponding to "if then else"
I-14     The general path merging function
I-15     The lockout switch
I-16     Synchronizing m input to n output paths
I-17     Start and stop places
I-18     An asynchronous multiplier example
II-1     Overall system organization
II-2     Diode switching matrix
II-3     Substitution of transistors for diodes
II-4     Interfacing to and from the switching matrix
II-5     A simple transition
II-6     Example showing inadequacy of simple transition
II-7     A better transition
II-8     Situation showing necessity of arbitration
II-9     Final implementation of the transition
II-10    Emptying a transition's input places
II-11    The output place
II-12    The input place
II-13    Connections for transition to output place
II-14    Connections for input place to transition
II-15    The internal place
II-16    Connections for the internal place
II-17    The decision line
II-18    Operation of the decision line
II-19    The start place
II-20    Row-column intersection in register array
II-21    Register array address decoding
II-22    Register array address circuitry truth table
II-23    Generating an array program
III-1    Dividing up an array
III-2    Division of representative array pattern
III-3    Multiplexed array control circuitry
III-4    The place for the multiplexed array
III-5    Timing for a typical place
III-6    Simple transition for the multiplexed array
III-7    Block diagram of the memory controller
III-8    Memory controller circuitry
III-9    Timing diagram for memory controller
III-10   Memory controller internal data flow

Introduction

All modern commercial computer Central Processing Unit design is based on synchronous logical realization. Not all of this is totally hardwired logic. Indeed, most CPU's use the technique of microprogramming (6). In this technique a machine language instruction is not directly executed -- each machine instruction causes the execution of a set of microinstructions that each perform a small part of the machine instruction. These microinstructions are stored in a control memory, usually read-only, and must be fetched sequentially to complete each machine instruction. This type of realization is slower than an equivalent completely hardwired implementation of each machine instruction because of the multiplicity of microinstruction fetches that may be required to execute only one machine instruction, but it is much cheaper in two senses.
First, it is very flexible, because to change the machine instruction set one need only change the microprogram for each instruction, which can be easily done by changing the contents of the control store. Second, less logic is required, as the microinstructions need not be very complex.

Asynchronous computation techniques, such as those pioneered by the Computation Structures Group at Project MAC (2), are inherently much faster than hardwired synchronous techniques, and are particularly suited to parallel processing situations. Asynchronous techniques allow a process to proceed as fast as data can flow through the system, with no need for slack time to await clock generated timing signals. They are more expensive to implement, as more hardware is generally required for an asynchronous as opposed to an equivalent synchronous system. These systems, however, have been attractive research subjects because of the simple solutions they offer to many problems that plague synchronous systems, such as processor coordination without faults, parallel computation, and others related to communication. Thus problems of synchronizing and coordinating subsystems are also solved.

While they are fast, hardwired asynchronous systems are just as inflexible as hardwired synchronous systems; thus speed increases in synchronous systems are much more easily obtained by upgrading logic component speed than by developing a new asynchronous system from the ground up. An ideal and more attractive method would be the combining, in some form, of asynchronous computation and microprogramming. This should retain the speed of asynchronous circuitry as well as its ease in handling coordination problems, plus it would incorporate the flexibility of a microprogrammed system. A system such as this would be a much more practical alternative for a fast system than a hardwired asynchronous design.

I. The Petri Net

As presented in the introduction, the use of asynchronous techniques in a microprogrammed cpu could lend itself to a new flexibility in design, from both the standpoints of ultimate attainable speed and ease in the original design or subsequent modification of the system. However, before one jumps directly into the specification and ultimate design of a system, it would be wise to have available an adequate tool to model ( describe, specify ) the elements, and the desired function of these elements, in a completely unambiguous manner. Such tools, most likely of a mathematical nature ( i.e., as an extension of some graph theory or formal programming language ), could then be useful not only in the system design phase, but in further applications concerning the general theory of asynchronous systems. Hardware description languages for synchronous cpus ( e.g., CDL, AHDL, LOTIS ), both hardwired and microprogrammed, currently exist and have found wide applicability in the design and implementation phases for cpus (9). These tools are programs, generally written in some high level computer language, that simulate the operation of the desired system based upon statements describing (1) the structure ( e.g., physical characteristics such as register layout definitions and interconnections ) and (2) the control flow ( statements relating to the order of performing operations ) of elements in the system.
If written correctly, these simulators are applicable to both synchronous and asynchronous systems design ( LOTIS, for example ), but they generally confine the user to the design/simulation of a specifici special purpose system, and have no extensive merit regarding general design theory. Some systems even 10 possess the ability to generate the requisite logic diagrams and wiring lists needed to construct the system ( the CASD hardware description language has such an ability ), but these are for hardwired, synchronous systems, and their efficiency is rather low. Such ability does not exist in any currently well known hardware description language for developing an asynchronous microprogrammed system. To fill this gap, we will now introduce the concept of a Petri Net as a model for the control structure of an asynchronous system, as set forth by Patil in (8). Work in areas similar to this has been conducted by Jump in (5), with his "transition nets", and Holt in (4) with "occurrence systems" ( both are actually forms of Petri nets ). The basic theory present behind our use of the Petri Net model is not much different from either of these; however, when the actual physical implementation is described ( in later sections of this paper ), significant differences will be observed. Specifically, Jump's cellular array does implement the required asynchronous control structure ( implicitly including the ability to interface with external devices ) but lacking on two major points. system (1) His does not possess the ability to conditionally alter the flow through the system based upon decisions, restricting one to the same execution path on each cycle; this is judged to be a serious restriction on the usefulness of his system. And, (2) due to his design method of placing the control circuitry for the array within the array, and not at its boundaries, as we have done, its ability to be "programmed" is poor, requiring a major rewiring of the array to change the structures "program". Our system posseses both conditionals and is easily reprogrammable; and is 11 thus a very viable system to be used as a general purpose asynchronous control structure. I-A. 1. The Petri Net Model Physical Structure of the Petri Net The Petri net is fundamentally a means of representing a system, and its behaviour, through the use of a directed graph. Named after Carl Petri, its inventor, he first called them "transition nets", probably due to their use as modeling a system as a sequence of transitions between states. The structural elements of a Petri net consist of three items: the "place", the "transition", and the "arc". function in the overall net structure. Each element has a specific The places, to be represented by circles ( see figure I-I A ) act as the elements which record the state of the system at a specific time. How the state is recorded will be explained shortly; it will suffice for now to say that the set of all places forms the state description of the system the net models at any instant in time. outputs inputs Fig. I-1 A) 2 P-f 2j 2 2 Representation of the "place". outputs inputs Fig. I-1 B) The transition, represented by a vertical bar,( Representation of the "transition". see figure I-1 B ) is the active element in the net, as they direct the control flow through the net, altering the state of the places as the "computation" the net is performing proceeds in time. The physical structure of the net ( i.e. its morphology ) 12 is determined by the directed arcs ( arrows ) in the net's construction. 
These arcs are used to specify the inputs and outputs of each place and transition in the net. By design, arcs connect places to transitions ( and likewise transitions to places ). Logical sense precludes the possibility of connecting like elements ( e.g., place to place ). As a syntactic convention, places whose arcs connect from the place to a transition are referred to as "input places to the transition". Corresponding arcs originating at a transition and terminating at a place specify the "output places of the transition". Similar definitions exist for the input transitions to a place, and the output transitions of a place. In general it may be inferred from figure I-1 that both transitions and places may have an arbitrary number of input and output places and transitions, respectively. Note, however, that each transition and place must have at least one of each. Based upon these construction rules, figure I-2 displays a sample Petri net.

Figure I-2) A sample Petri net.

2. The Flow of Control through the Petri Net

Representing the flow of control through a net consists of (1) specification of the structure of the net ( by places, transitions, and arcs, as above ), and (2) the use of "tokens" to model the actual asynchronous control signals proceeding through the system. The token will represent the presence of a signal at that point in the net where it is held by a place. ( Thus the appropriate name for the places - as "places where tokens may be held". ) In the most general type of Petri net, a given place may possess any integer number of tokens ( assumed non-negative, i.e., 0, 1, 2, ... ). We will restrict ourselves, however, to a system in which each place can contain either zero ( "empty" ) or one ( "full" ) token. This restriction simplifies both the implementation of places and transitions, and as will be shown shortly, causes little loss in generality ( at the expense of some complexity ) in the classes of systems which can be represented by our schema. Such nets will be termed "safe" nets. Figure I-3 details the schematic representation of places as they can appear in a safe net. Note also that at initialization time of the net, places may start in either the full or empty state. Initially empty places will generally be used as normal elements to pass along the control signals; initially full places will most often be used for semaphore and resource sharing applications. Further uses in this area will be discussed later in section I-C.

Figure I-3) Places and tokens in a safe net. A. A full place ( 1 token ). B. An empty place ( 0 tokens ).

Control flow is directed through the net by the "firing" of transitions. This action shifts tokens between places, thus altering the state of the system and allowing the computation specified by the net to proceed accordingly. The rule for firing a transition is extremely simple: if all the input places to the transition are full, then the transition is 'enabled to fire'. For any other combination of tokens in the input places of this transition, the transition is held in the wait state, disabled from firing until the above requirement is met. Firing a transition then consists of the following operations: (1) simultaneously removing the tokens from each input place of the firing transition ( going from full to empty ), and when this is done, (2) placing a token in each output place of the transition ( going from empty to full ).
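To make the firing rule concrete, the short Python sketch below ( our own illustration, not part of the original design; all names are ours ) simulates a safe net in exactly these terms: a marking records which places currently hold a token, a transition is enabled when every one of its input places is full ( and, to respect safeness, every output place is empty ), and firing first empties the inputs and then fills the outputs.

    # Minimal sketch of a safe Petri net: each place holds 0 or 1 tokens.
    class PetriNet:
        def __init__(self, places, transitions, marking):
            # transitions: name -> ( input places, output places )
            self.places = set(places)
            self.transitions = dict(transitions)
            self.marking = set(marking)      # the set of currently full places

        def enabled(self, t):
            inputs, outputs = self.transitions[t]
            # all input places full; for a safe net the output places
            # must also be empty, or firing would overfill one of them
            return all(p in self.marking for p in inputs) and \
                   all(p not in self.marking for p in outputs)

        def fire(self, t):
            if not self.enabled(t):
                raise RuntimeError("transition %s is not enabled" % t)
            inputs, outputs = self.transitions[t]
            self.marking -= set(inputs)      # (1) empty every input place
            self.marking |= set(outputs)     # (2) then fill every output place

    # A two-place loop P1 -> T1 -> P2 -> T2 -> P1, with P1 initially full:
    net = PetriNet(
        places={"P1", "P2"},
        transitions={"T1": (("P1",), ("P2",)), "T2": (("P2",), ("P1",))},
        marking={"P1"},
    )
    net.fire("T1")
    assert net.marking == {"P2"}

Firing T1 in this small loop simply moves the single token from P1 to P2; firing T2 then moves it back, the safe-net picture of a control cycle that resets itself.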
This algorithm implicitly assumes the safeness of the net construction: all the output places must be in the empty state when the transition is enabled to fire; if this cannot be guaranteed, the net is unsafely constructed, allowing the possibility for two ( or more ) tokens to attempt to occupy the same place simultaneously. Figure 1-4 details the sequence for the firing of a multiple input and output transition. Notice that there are no time constraints imposed upon the time required to fire a given transition,,nor on the time that a token may reside in a place. These observations are fundamental to the asynchronous modeling ability of the Petri net structure. Returning to the important issue of safeness, it should be noted 15 A. Transition Ti is held in the wait state, the input places Pi to Pn are full. as not all Ti B. All the input places Pj to transition T, are now full; the transition is enabled to fire. rl C. Q1 Transition T1 has completed firing; all the input places PI to Pn are emptied, all the output places Qi to Qm are filled. Figure 1-4) The steps in the firing of a transition. 16 that this property is a fundamental requirement of the nets in our system. If an unbounded number of tokens could exist in any place, then any practical implementation would require an infinite capacity counter at each place to record its state; this is clearly not realizable in a real system. If we assume that the number of tokens at any place is at least bounded by some number 2, then we can ( by implementing an n-stage binary counter at the unsafe place, through transitions ans safe places ) transform any n-bounded unsafe net of this type to a functionally equivalent safe net. Through this process we can then represent any finite state system ( which can be modeled by a n-bounded unsafe net ) as a safe net, and therefore able to be simulated by our system. Section I-C, "Typical Net Construc- tions", details the functions that can be represented by the possible place-transition interconnections, and which of these can lead to problems regarding the safeness of the entire net. Technically, safeness can be defined as, given an initial placement of tokens throughout the net ( the "initial configuration" ), then a net is said to be "safe" if and only if any firing sequence of transitions yields no more than one token per place. In a large system, this can be a difficult criteria to establish; verifying safeness on the "subsystem level" and proceeding upward seems much more viable ( a structured programming type approach ). Safeness is best gained by careful speci- fication and design of the system; intuition also seems to help. 17 I-B. Our Control Structure Model The previous Petri net model for asynchronous systems provides a good theoretical base for system design and analysis. However, several important features are lacking which would be necessary to fully utilize the net facility to actually implement a useful asynchronous system in "hardware". The two major functions that need to be added to our system are (1) interfacing to the outside world, and (2) a method of altering the sequence of control based upon signals obtained from outside the matrix. With these added abilities, our matrix is then able to function with the equivalent functional complexity of the control circuitry of a cpu. 1. 
Communication with the Outside World Implementation of communication links with external systems ( circuits ) is desired along the lines ( for compatibility ) as first developed by Patil and his macro-modular asynchronous building blocks in (6) and Patil and Dennis' asynchronous modules in (2). In this manner asynchronous control signals can be passed to and from our simulator, so that it looks to the other modules as just another module in the system, freely interchangeable Signals in this asynchronous modular system consist of transitions ( not to be confused with the transition in the Petri net ) on the control wires from 0 ( low ) to I ( high ) or I to 0; each is a completely equivalent signal-:#- the change in level represents the signal. To be able to transmit and receive this signal, we have expanded the definition of a place to include a new subtype, called an "input-output place" ( or i/o-place ); the old place will henceforth be referred to as an "internal 18 place". The input-output place will be represented by the square as in figure 1-5. This place actually consists of two halves: the output half, which transmits the "ready" signal ( the level change, referred to earlier ) from the matrix control structure, along a control line, to the external asynchronous module; and the input half, which receives the corresponding "acknowledge" signal from the external module and enters it into the matrix. ( Note - we will not attempt here to enter into a full discussion concerning the ready/acknowledge signalling conventions that will be used. The reader is referred to reference (2) for Dennis' and Patil's presentation of this topic and its use with the asynchronous modular system. ) This new place can replace any previous titernal place in the Petri net structure; if it is desired to perform some external function at that time in the net's "computation". The internal place's function is now to act only as an internal status indicator within the confines of the matrix ( thus the name "internal" ). The input-output place is used to asynchro- nously activate external devices connected to the matrix; the internal place for such uses as resource sharing or lock-type semaphores. output half input half Ix 1N 1 2 >2 outputs inputs m Figure 1-5) n Representation for the input-output place. 19 Mechanically, the filling and emptying of the input-output place is directly compatible with that of the internal place, to insure their direct substitutibility. The i-o place works as follows: Assume it is initially in the empty state ( it will be at system initialization, as all control links, and thus the i-o place, are reset to the inactive state at this time ). Now let a transition, of which this is an output place, fire; thus a token will attempt to enter into this place. As this operation begins, the token enters the 6utputa( left ) half, and immediately a ready signal ( transition, level change ) is sent along the place's control link on the ready wire. The "official" state of the place is still empty, so as not to fire any transitions attached to its input ( right ) half prematurely; but the control link is in the active state now, and presumably the device attached to the link is performing its operation. Some arbitrary time later ( after it finishes ), the corresponding acknowledge signal is returned on the acknowledge wire, indicating the external device has finished. The link now enters the itiactive state, and the i-o place then enters the full state, indicating the presence of a token. 
same as an internal place. At this point, its behavior is exactly the Subsequently, its token will be removed by a transition being fed by this place, resetting it to the empty state, ready to begin another cycle. Thus communication with external devices or systems ( possibly even another matrix such as this one ) is handled in a very clean asynchronous manner, compatible with the previously developed signaling criteria. Figure 1-6 details a typical portion of a net containing an input-output place, and the flow of control that results from its holding a token. 20 r+ P3 TP2 P1 A. ia Internal place PI contains a token; transition T1 enabled to fire. Ready/acknowledge link on i-o place P2 is inactive. *** I ri P1 B. Tj a P2 T2 P3 Transition T1 has fired; token placed in output half of i-o place P2; ready signal sent on control link; transition T 2 not yet enabled. a ri P1 C. Tj P3 Acknowledge signal received on control link; token placed in input half of i-o place P2; transition T2 now enabled to fire. t P1.TjP D. T2 IP2 I T2 P3 TrRnsition T2 has fired, token placed in internal place P3 . Control link on place P2 is again in inactive state. Figure 1-6) Logical sequence for the filling/emptying of an input-output place. 21 2. Decision Handling in the Net The handling of decisions by the Petri net schema will again be implemented by a modification of the place structure. Decisions will be made on the simple true/false basis, which is easiest to implement and yet provides good flexibility for operating on digital binary data. The modified notation is detailed in figure I-7A for the i-o decision place, and in figure I-7B for the internal decision place. Both work in exactly the same manner as previously presented, except the appearence of the token into the true or false branch is controlled by the appropriate decision line from the external environment. In effect the output of the decision place ( either internal or i-o ) is directed, in a mutually exclusive manner, to either the true or false brandh depending upon the status of the decision line. How this is physically accomplished by the net simulator will be detailed in a later section on the actual physical design of the matrix circuitry. Figure 1-8 shows a sample Petri net execution of a conditional place where the decision line held a 'true* status. inputs outputs inputs T. A. Representation of the input-output decision place. Figure 1-7) The decision places. outputs T B. Representation of the internal decision place. 22 El il 9P2 ---- F P1 Tl J2 A. Place P1 contains a token; transition Tl enabled to fire. T T2 P4 P3 ~ T2 P1TP)V~ P3 B. Transition Ti has fired; decision line affecting placement of token I--- P1 Tj P2 C. Token placed according to decision ( true ); transition T2 enabled to fires transition T3 held in wait state D. Transition T2 has fired Figure I-8) Logical sequence for the filling/emptying of a decision place. ( Internal place used in example; could equally well have been an input-output place for P . ) 2 23 I-C. Typical Net Constructions Present in the complete repetoire of possible place-transition interconnections are several constructions that deserve special mention, either due to their usefulness in illustrating a specific point, or displaying a necessary restriction on the class of nets that can be represented by a 'safe' system. 1. Creating Tokens Tokens can be created for use on concurrent execution paths through the use of a transition with multiple outputs, as in figure 1-9. 
This operation is analogous to the "fork" operation used in other asynchronous system descriptions. that the conservation of tokens, Note that this function illustrates by number, is not required on the level of a single transition ( tokens will only be conserved if the number of input places equal the number of output places for that transition ). However, for the net to remain safe, these tokens must somehow be collected at some later time and merged, so that around any loop in the net it is a necessary condition that tokens be conserved, but not in general sufficient. Sufficiency will be guaranteed by correct ( in regards to safeness ) construction of each parallil sub-branch of the loop. SP2 P2 A. Before T1 fires Figure 1-9) B. After Ti fires Creating tokens (concurrent paths ) with a multiple output transition. 24 2. Merging Tokens In a similar method to that employed above, tokens can be merged from parallel, concurrent execution paths through the use of a multiple input transition. This corresponds to the "join" or "logical and" operations employed in other systems. Note that, by definition of the transition operation, all the input places must be full before the transition is enabled to fire ( thus the logical and analogy ). Figure 1-10 illustrates concurrent path merging with a multiple input transition. Pi P1 0:T l T l P2 P2P A. Before Ti fires Figure I-10) 3. B. After Ti fires Merging tokens ( concurrent paths ) with a multiple input transition. Splitting Paths The function of providing a choice of multiple execution paths in a net can be done in two manners. The first, discussed previously, provides for concurrent execution of each of a number of parallel paths through the use of a multiple output transition. This second method to be presented here enables the designer to select one of these parallel paths to execute, either arbitrarily or by distinct choice. Selection of a specific path from a set of choices based upon testing conditions is done using the true/false conditional places presented previously. The token will be directed to the desired branch 25 depending upon the status of the decision line. An arbitrary path can be selected using the multiple output facility of a place into several different transitions. This compli- cates the firing rule for transitions, however, as there will now be contention by the output transitions for possession of this places' token. Figure I-11 illustrates the problem. Tj P1 When place PO becomes full#, then by ITj PI T2 P PO PO PO T2 T2 P2 B T , E P T2 P2 2 AT2 Figure I-11) Selection of mutually exclusive paths thru a multiple output place and an arbiter. our previous rule both transitions TI and T2 are enabled to fire. cannot happen, however, as we have assumed that (1) full ( one token ) or empty ( no tokens ), and (2) a whole token for itself to fire.. This either a place is a transition requires We could not therefore split the token in place PO in half, and give half to each transition. To resolve this difficulty, we have created the concept of an hrbiter', which is 'attached' ( theoretically ) to all the output transitions of this place ( in theory there may be an arbitrary number, example ). not just two as in the This over-seer of the connection then arbitrarily decides which transition is able to capture the token, and subsequently fire; so that the net setup in figure I-11A can yield either the upper or lower version in I-11B after a transition fires, depending upon the 'decision' 26 of the arbiter. 
The arbiter should decide with no prejudice which transition will receive the token; all should be equally likely. How this operation is physically accomplished will be discussed in the next section on the implementation of the transition circuit. The use of this construction is extremely important in the development of resource sharing control circuitry. For example, to lock out mutually exclusive operations,( e.g., read and write commands to a memory device ) the net structure of figure 1-12 may be used to share the resource referenced by both subnets N1 and N2. explains its operation in detail. The annotation under the figure Note here the useful quality of being able to initially specify a place be full ( the internal place resource sharing semaphore ). P1 Ti T3 P3 Ni P2< J2N T4 O P4 Figure 1-12) An arbiter. Either subnet N1 or N 2 may execute based on whether a token is present first in Pl or P2 respectively. Presumably the operations in N 1 and N 2 are mutually exclusive. If tokens arrive simultaneously, the arbiter between transitions T1 and T 2 decides where the token in PO will go. The branch chosen steals the token in Po ( a semaphore ), inhibiting execution of the other branch until it is done; when finished, it replaces the token, resetting the arbiter for the next cycle. 27 4. Merging Paths At some points in the net it might be desirable to merge several paths into a common path not with an 'and# function, as previously presented for merging tokens, but rather with an 'or' function, so that a token arriving on any input branch will produce an output. This will especially be used when one of a number of concurrent paths is executed, and it is then desired to merge all these into a single main path.( i.e., rejoining mutually exclusive paths after a conditional branch; the "if cond then N, else N 2 function, see figure 1-13 ). The use of a multiple input place accomplishes this function. Note however that this construction places a constraint on the design of the net so that it remains safe. The designer must insure that control is never given to more than one of the parallel paths at any time, or indeterminacy can result, as a transition ( TI to Tn in figure 1-14 ) could then possibly fire into a full place ( PO ). Figure 1-14 illustrates the use of the path merging function. PI IN - N cond P2 F Figure 1-13) N2 Merging structure corresponding to the "if cond then N1 else N2 " construction. 28 P1 Tj P2 T2 PO Ph Figure 1-14) 5. Tn The general path merging function using a multiple input place. Note that only one of places PI to Pn may be full at any one time to insure safenebs of the net; i.e., that a transition will not fire into a full place ( in this case, PO ). Other Useful Functions In this section we will present some other useful constructions and functions performable by place/transition interconnections that should be very useful in developing Petri net control systems. Figure 1-15 details the operation of the "lockout switch". Here place P1 insures that only one token entering from P2 can get into net NI at a time. The lockout switch is reset only when the token again leaves net N1 , continuing execution with place P3* This construction is useful for applications requiring serially reusable resources, or implementing locks on control sections that must be executed in entirety without being interrupted or restarted. Synchronization of parallel paths can be performed very cleanly with the use of a multiple input/multiple output transition. 
In general, with an m-input/n-output transition, one could insure that execution on each of the m input paths had been completed before starting any of the 29 m output paths. Figure 1-16 displays this operation. Starting and stopping the net can be done by, respectively, a place with no inputs that is initially full ( the "start place" either a place or transition with no outputs. ), and This is a special change of the rule that each place and transition must have at least one input and one output arc$ however, these functions are convenient, and this method provides a very simple implementation. The net in figure 1-17 illustrates the use of this facility. T2 P2 T3 P3 NI Figure 1-15) The lockout switch. Subnet Nl is prevented from being restarted while it is executing. Ql P1 Tj Figure 1-16) Synchronizing m input paths forming n output paths. OFigure I-17A) A start place. ~HO Figure I-47B) A stop place. 30 I-D. Matrix Notation for Petri Nets A very convenient matrix notation for Petri nets has been devised by Patil. Very simply, it is constructed by enumerating all the places ( including internal, input-output, and decision lines ) present in the system along the column heads of a table, and likewise all the transitions along the row heads of the table. Connections within the net are then displayed as elements in the table using the following code: (1) an empty block, to represent no connection (2) a dot ( ), to represent an arc from a place to a transition, (3) a cross ()), to represent an arc from a transition to a place Note that this code leaves out the possibility that the same place may be both an input to and an output from the same transition. However this omission causes no loss in generality ( it can easily be simulated by using a loop of two places and two transitions ), and thus will be readily accepted, as it immensely. simplifies representation of the loop This code and table organization easily handles all the input/output possibilities for the transition, and the internal place. each only requires only one row ( or column ) to specify its inputs and outputs. However# the table must be extended when we add the input-- output place and decision capability, as each of these requires two columns by itself to uniquely represent its function in the table using our code defined above ( and as will be seen later in the section detailing the actual implementation, this choice of representation simplifies the circuit requirements for the input-output place and decision lines ). The two columns for the i-o place specify the output half and the 31 input half respectively. it The first column may only contain a cross, may only be the output from a transition ( the output half ). second column'( as The the input half ) may likewise only contain a dot, denoting its function as an input to a transition. The decision lines similarly require two columns, one each for the true and false branches. Each column may contain only dots ( i.e., show- ing inputs to transitions ), and placed at the row intersection of the first transition(s) of each branch ( true, false ). Note that each row that contains one of these "decision dots" must also contain at least one dot under a place column, representing an arc from a place to a transition. If there is more than one input to this transition ( a dot ) and a decision dot is also present, then this transition will fire only when all the places are full, and the decision is satisfied ( as would be expected ). 
Figure 1-18 presents a complete sample system, to perform an asynchronous fixed point multiplication of two binary numbers by the "shift and add" method. Note the system operates completely asynchro- as the time to perform the emputation varies with the input nously, numbers. Figure I-18A details the data flow portion of the system. The control structure ( in Petri net form ), is presented in figure I-18B. Observe the use of several of the constructions presented previously: creating and merging parallel paths; creating and merging mutually exclusive paths; starting and stopping the net; ordering parallel and sequential operations; use of decision branches to change the flow. The schematic representation of the control structure in figure I-18B is then easily tabulated in matrix form in figure I-18C. notation will be the basis for the "program" This matrix in the programmable version. 32 A. DATA FLOW input REG A FAN IN A + C: E REG reuC dd 7 S ift Left CC b input B FAN REG C. 0 ? =0 fift Rightf Be CONTROL FLOW ho or 13=0 Figure -18) AShiftdasnhoosmlilefrtwbnry T1-+ T,~~T T LOAD ~ SRIR*A Figur-0 - ) A440/8=aynh1 ~ two's~ fixed e ouutil pointf ~ Ad f rt o i ay TopeetnmerC=A*B 33 I Start 0 * X I:I Places and Decisions P3 P5 P6 P4 o II C ~I 0 I1 - 02 0 I o D P7 I1 T IF T F X * Tj * X T2 * * X X x X x * X * * X Start place ( input only ) T3 *T4 * * * Input-output Internal place, place two columns X * T5 T6 T7 * Stop place ( output only ) T8 Transitions Decisions, two columns each Figure I-18C. The matrix representation of the asynchronous multiplier. 34 Summary This section has presented the detailed development of a tool the Petri net, as we have modified its structure - for use as both a theoretical and practical model for asynchronous systems. of its The development representation by the matrix form, as a type of "program" for the places ( internal, input-output, and decision attachments ) and transitions, will now be expanded, in the next section, to a hardware structure that is capable of simulating any Petri net representable in this form. next section is devoted to this task. The 35 II. The Programmable Array Design Objectives Our goal is to design and build a practical asynchronous control structure whose program may be changed electronically. As it turns out, the array representation of Petri nets described in the previous section An electronic analog of this structure, is an ideal basis for design. with the capability of electronically altering the pattern of intersections in the array, would be able to simulate any Petri net that the array could represent. It could thus perform the actions of any control struc- ture that the array could simulate. It will have the following characteristics: a blueprint for our design. 1) Thus we chose that representation as There will be a row of places across the top of the array,some of which will interface with the outside world via the ready-acknowledge scheme described by Dennis2 and Patil2,6 ternal places. demonstrative, I/O places, and some of which will be in- To keep the structure small but still it was decided to design it with four four internal places, and two decision lines. 2) There will be a column of transitions along the right side of the array. These will be able to be input or output transitions of any place. For reasons ela- borated later, each pair of transitions will have an arbiter between them on their inputs. eight transitions in the design. 
There will be 36 3) There will be a switching matrix to route the signals (in effect, the tokens) from places through transitions back to places. It will be basically ver- tical wires that are inputs and outputs from the places, and horizontal wires that are inputs and outputs to and from the transitions. In Patil's ori- 8 ginal version, the pattern of interconnection of the vertical and horizontal wires is determined by the physical placement of diodes. The diodes are to be replaced with transistors, with the resulting ability to make the transistors act as open circuits or as diodes by turning them off or on. 4) An array of registers will be used, the outputs of which will drive the transistors in the switching matrix. The contents of this array is alterable, and thus so is the Petri net being simulated. This portion of the structure is expensive, and thus was a constraint on overall size. The resulting structure (figure TI-1 shows the general layout), can simulate a wide variety of Petri nets, and thus can be used with the modules already built in the Computation Structures Laboratory to use the ready-acknowledge signalling convention to perform a number of operations. The structure itself, while not being directly expandable, can be interfaced with other similar structures of any size through the I/O places, and thus structures of any size may be built. 37 control links to outside world > ra ra ra decisions II [ I/4 start place ra laces jjplaces transitions ]1 4' SWITCHING MATRIX lUTUIT REGISTER ARRAY Figure II-1. Overall system organization. TtIU 38 II-A The Switching Matrix The specification of the Petri net to be simulated by our system lies in the interconnections enabled by the switching matrix. array is operating, While the the place and transition circuitry perform all the logical operations to simulate the desired net, with the switching matrix directing the signals. How these signals are represented and directed is the topic of this section. Requirements The design objectives require the switching matrix to have the ability to connect the outputs of every transition to the inputs of every place, and the outputs of every place to the inputs of every transition. As such, the matrix is nothing more than a specialized crossbar circuit that can be individually enabled at each intersection. Furthermore, we also desire that to easily implement the correct functionality of the place and transition elements: inputs to places from transitions use the 'or* function for merging signals, sition may fill/empty any place, so that any tran- without concern of what the other in- active transitions attached as inputs to the place are doing; and that inputs to trinsitions from places use the 'and' function for ierging signals, so that all the input places of a transition must be full before it may fire, and that it fills all its output places before it finishes firing. Implementation At first these requirements seem prohibitive to constructing a cost effective switching matrix, but a simple method of doing so has been 39 developed. It works using the following signaling conventions, and the interconnection structure of figure 11-2. For the lines to send signals from the transitions to the places, we use active high levels. Thus the level on the output line will be high ( = 1 ) only if all the connected input lines are high ( = 1 ). implements the required 'and'ing function. this principle. 
This Figure 11-2 can illustrate If any of input lines Ll, L 2 , or L3 are low ( = 0 ), output line L2 will be pulled low ( = 0 ) throvgh the forward-biased ( conducting ) diodes D2 ,1 , D2 ,2 , and D2 ,3 tespectively ( which provide isolation from one input line to another, to prevent crosstalk ). Only input lines are high ( = 1 ) do we get a high ( = 1 ) on the when all output line, as all the diodes are now in a reverse-biased, nonconducting state. Thus our active high 'and' function directs signals from places to transitions. Similar to above, for the lines which send signals from the transitions to the places, we use active low levels. Thus the level on the output line will be low ( = 1 ) if any connected input line is low ( = 1 ). The output level will be high ( = 0 ) only if input lines feeding it are high ( = 0 ). 'or'ing function. as above, all the This implements the required Figure 11-2 can be used again, with the same argument except for now using active low levels, to direct signals from transitions to places using the 'or' function. Note also the other connection possibilities present. For instance, input line Ll will never affect output line L3 ( or Ll ) as there is no diode present at the junction where they cross to pass the signal ( in figure 11-2 ). Using the technique of inserting diodes at the desired 40 +5 r~w ('W /wv 1k L3 +5 D 3,2 f 1k OUTPUT r - - - L2 4 4/4D 2,2 7-,3 D 2 ,1 LINES (typical) +5 1k Ll ,D . ** 1,#2 A,Ll IL3 INPUT LINES (typical) Figure 11-2) A typical portion of the switching matrix. Note the orientation of the diodes to transfer zero-positive level signals from the input to the output lines, and their ability to prevent crosstalk between input lines. 41 junctions, we can thus direct the signals between the places and transitions selectively, so that any well formed Petri net structure could be represented in the switching matrix. Patil in his hardwired array discussed in (8) used just such a system ( with diodes on plugs, that could be mechanically moved to change the net structure ). We have expanded this system in the manner illustrated in figure 11-3. At the junction of every input and output line in the matrix, we have replaced a possible hardwired diode with a transistor, oriented as in figure 11-3. The signal 'flows'through the transistor from emitter to collector, much as it 'flowed' through the diode from anode to cathode. However, the use of a transistor has an important consequence. When held in the nonconducting state ( off ), the transistor looks like an open circuit across the junction, so the input signal ( at the emitter ) has no affect on the output line ( at the collector ). Thus we have a 'nodiode' connection. Held in the conducting state ( on ), however, the transistor now behaves like our diode placed across the junction as before, with the same influence of the input line on the output line. We now have a programmable diode at the junction. The transistor we have used at the junction are 2N5134 type high speed silicon NPN saturated switching transistors, which possess the desired physical characteristics, and yet are also very inexpensive. ( 9e each at this writing ). However, the use of the transistor switches to simulate the diodes requires a memory to specify the status of each switching transistor in the array - either on, representing a diode, or off, providing an open circuit condition. TTL levels available from the memory can be used to switch these transistors very nicely. 
A TTL logical 42 +5 AA/ 1k 13 +5 1k J Q _3,3 Q 3,1 Q 3,2 OUTPUT L2 +5 LINES (typical) S1k Q 2,1 Q 22 Q 2 ,3 4 x to memory FF*s Q 1,3 I Q l,2 Q 1,1 Li L3 INPUT LINES Note: Transistor bases are connected to individual memory flip flop outputs (typical) Figure 11-3) Replacement of the hardwired diodes by transistors results in the above structure for the switching matrix. 43 0 ( approximately +0.2 volt, the off, tion'. and a current sink ) holds the transistor in or nonconducting state, providing the open circuit, or 'no-connec- Likewise a TTL logical 1 ( approximately +3.8 volts, 1.6 mA DC current source ) provides ample base current to switch the transistor on, conducting, and establish a 'diode-connection'. The implementation of this memory circuit ( as an array of registers ) is detailed in a later section. Interface Circuitry We desired to finalize the design for the switching matrix as in figure 11-3, using the transistors, because this was the simplest organization that provided a workable solution. However, after extensive testing to insure the correctness of our design, we found that the simple one transistor/junction switching matrix we intended to use did not always pass TTL levels in a usable form from input to output. Thus the interface circuitry described below is required to convert the matrix signals back to TTL compatible levels. On an input line to the transistor matrix, we found no problems in directly interfacing our active high or low TTL levels into the switching transistors. A standard TTL gate in the low state is able to sink up to 16 mA of current, and this could easily ground the emitter connections of each transistor connected to the input line. No excessive current flow problems were found even when all the transistors on a given input line were turned on ( this amounted to about twenty transistors loading the line ). Likewise, when the TTL output was high ( +3.8 volts ), ing was small enough so that no extra interface circuitry ( i.e., buffering ) was needed to maintain this high level, number of transistors ( twenty ) turned on. the loadextra even with the maximum Figure II-4A details the 44 Input Interface: TTL circuitry Figure 11-4 A TTL logic to transistor matrix Matrix input line F# !3 *A- - +3.8- +3.8 - +02 L - 1 2 +0.2_4 A 3 4 1 Figure Output Interface: Matrix output L 2 B 3 4 HI-4 B Transistor matrix to TTL logic TTL ircuitry LM3900 I hi line iLJ 7413 Ll + +4 V IM +5 44+3' ... ... 1 ..2....... +5.0+3.8+0.6~~~~~~~1 12 C 3 4 I +0.2- a 2 4 1 2 E 3 4 45 interface circuitry ( none ) and waveforms on an input line to the switching matrix. We did not investigate the problems that might be encountered with a larger number of transistors loading the input line, as would be possible in a larger switching matrix. Probably some further type of buffering of the TTL levels would be required to provide the required current sinking and open circuit high voltage capability, but this remains to be investigated by those desiring to construct a larger array. Problems in recovering the signal intact, as a TTL usable level, on the output lines arose, however. When only one transistor was turned on ( at the junction of the input and output lines ), the signal propogated intact from input to output, as was expected. than one transistor, However, when more on either the input or output lines, was turned on, a high level on input stayed at +5 volts en output, its pullup level; but a low level rose from the normal value of +0.2-0.6 volt to +3 volts. 
signal was getting through, but it was level shifted upward. The This effect, though we did not analyze its cause completely, seems to be due to the existence of alternate current paths through the transistors, because of their somewhat bidirectional current flow nature. The problem was not present in a system using only unidirectional flow diodes ( Patil's hardwired diode-plug array, in (8) ). found to be +3 volts. The limiting value for a low was To remedy this situation, the interface circuitry in figure II-4B was developed to convert this +5/+3 signaling system back to the standard +3.8/+0.2 TTL levels respectively. Basically just an inverting voltage comparator ( using an OP AMP ) centered at -+4 volts ( halfway between minimum high level and maximum low level ), it has worked 46 reliably in breadboard versions, producing clean TTL level output ( corresponding exactly to the input ) no matter what transistors are on; at frequencies from DC to megahertz. Figure II-4B details how the waveforms pass through this interface circuitry. Summary We have specified here the design and practical implementation of both a hardwired ( using diodes ) and programmable ( using transistors ) switching matrix for routing the bidirectional signals from places to transitions, that is extremely simple, yet very versatile. The next sections will now develop the circuitry for the places and transitions, and the memory to control our transistor matrix, so that we may combine all these elements into a total Petri net simulator system. 47 B. The Implementation The Switching Matrix and Signalling Constraints The matrix consists of horizontal and vertical wires, some being inputs to the matrix, the rest being outputs. Vertical wires conduct sig- nals to and from places, and horizontal wires conduct signals to and from transitions. As described previously, transistors are used at intersect- ing vertical and horizontal wires to perform a logical and, function. Thus the state of each vertical output wire is the logical and of the states of each horizontal wire that is connected to it by an on transis- tor, and the state of each horizontal outnut wire is the logical and of the states of each vertical input wire that is connected to it transistor. its by an on Thus the only function that the matrix can perform between inputs and outputs is the and function. Coupled with the properties of places and transitions, this fact constrains the type of signalling that can be used between places and transitions, as explained below. Transitions By the definition of the transition, when it all of its input places, and then fills would appear to require four all of its fires, it empties output places. This from the matrix to each transition: lines one to indicate when all input places are full; one to indicate when all output places are empty; one to empty all of its input places; and one to fill all of its output places. It is not too constraining on net struc- ture to require that all output places of a transition be empty when all of its input places are full. The implementation is then simplified. If the line indicating that all input places are full is high when 48 this is true and low otherwise, nal wires as follows: each transition. then one can combine the last two sig- Let there be only one input to the matrix from Since the wire indicating that all a transition's out- put places are empty can be eliminated because of our constraint, is only one matrix output to each transition-all input places are full. 
its (some the wire indicating when When this wire is high (all are full) the input to the matrix is low. there input places When the matrix output is low, input is now empty) the matrix input from the transition is high (figure 11-5). The reasons for the inversion will be explained later. One could then design the places so that onthe high to low level change of the transition output wire that transition's input places are emptied, causing the transition input line to go low, thus causing its line to go high, and this level change can be used to fill output the transi- tion's output places. This is not quite enough, ure 11-6. however, Consider the situation in fig- Because input places a and c are full, emptying places a and c. Suppose, however, transition b fires, that place a empties faster than place c, so the transition input line goes low, causing transition b to place a token into place d. tion sees both its Place c has not yet emptied, so transi- input places(d and c) token into place f. full, and thus fires, putting a The final situation should be a token in place d, however. Therefore, another line is added as input to each transition. This line is high if low otherwise. x line. all the input places to a transition are empty, The line indicating that all and input places are full is the The line indicating that they are all empty is the y line, and 49 full indicator > from transition's input places. signal causing removal or insertion of tokens. Figure 1I-5. Simple transition. a B: a A: c b a C: b b c c d d 0f d d e f Figure 11-6. f Example showing inadequacy of simple transition. 50 the transition output line is the z line. terized by the following behavior: The transition is now charac- When line x goes high (all input pla- ces full) line y must be low (since no input places are empty), and output line z goes from high (quiescent) to low (firing). empties the input places, goes low. This level change and as soon as one of them is empty, line x Line y goes high as soon as the last input place is emptied, then output line z goes high again, causing tokens to be placed in each output place. A complete circuit that illustrates this behavior, with its input and output lines labelled as mentioned, is shown in figure 11-7, along with a timing diagram. It includes an LED indicator to show visual- ly when the transition is firing. This is still not quite complete, shown in figure II-8A. however. The tokens ina\nd c Consider the situation have arrived simultaneously. According to standard Petri net theory, after any action the situation should be one of the two cases shown in figure II-8B, with only one of the transitions having fired. At present, however, both transitions would be enabled as their conditions for firing would be met, and so both would fire simultaneously. It is not at all clear what the final state would be. The situation pictured, then, calls for some sort of arbitration on the firing of transitions so situated. Inputs to the arbiter must be both the x line from each transition involved and the z line, since the x line may go low before all the input places are empty, and one would like to keep the other transition(s) blocked until the firing transition has completely finished firing. arbiter pairs of transitions. For reasons of simplicity, one need only Therefore each adjacent pair of the eight 51 x (from slitching matrix) normally 0 rO+ y > IN (from swin 330 ohms normally I z (to swt ichng matrix) normaly 1 Timing: ----------------- x //0 y z Figure 11-7. 
Figure II-8. Situation showing the necessity of arbitration.

Therefore each adjacent pair of the eight transitions implemented is arbitrated as shown in figure II-9. The 74H60 nand gate is used as a threshold detector on the output of the set-reset flip-flop that makes up the arbiter, to ensure that no action is taken while the arbiter is in a metastable state (see Patil (7)). Since in any situation other than a place feeding two transitions one would not want any arbitration, another input, the select input, is added, which enables or disables the arbiter.

Figure II-9. Final transition with option for arbitration (timing is the same for each transition as for single transitions).

Places and signalling

Communication between places and transitions is through the array structure. Thus inputs to transitions are solely functions of place outputs, and inputs to places are solely functions of transition outputs. This function is the logical and function, and it imposes constraints on how places and transitions communicate. For example, consider a place that is an output for more than one transition. It should be filled whenever any of its input transitions fires. Since the input to the place that tells it to fill can only be the and of the z outputs from each of its input transitions, these z lines must be normally high. Then when any one of them goes low (its transition fires) the input line to the place will go low, filling the place. Thus the constraint mentioned earlier that the z lines be normally high is not arbitrary but necessary.

A further constraint mentioned earlier was that the input places of a transition should be emptied on the falling edge of the z line output of the transition, and the output places of the transition should be filled on the subsequent rising edge. This convention guarantees that input places are emptied before output places are filled. Figure II-10 shows the importance of this convention. Consider figure II-10A. Transition b fires, causing its z line to go low. If this falling edge is used by place c to become full, it may do so before place a has emptied. Then transition d could see both places a and c full, causing it to fire (figure II-10B and C). If the place uses the rising edge to become filled, no ambiguity results, and the correct situation of figure II-10D is obtained.

The Input/Output Place

In this system the output place is paired with an input place to form an I/O place, or external place, and this is used to interface with the outside world via a ready-acknowledge control link. The output half sends a ready out on the link, and the input half receives the acknowledge from that link, then indicating to the matrix the presence of a token. The output half only sends out a ready on the receipt of a signal (the rising edge of a z line from some transition that is an input to this I/O place) that indicates a token is being put into the I/O place. It need furnish no information to the matrix, since the presence of a token will not be indicated until the receipt of the acknowledge, and thus that information can be furnished by the input half. The matrix output into the output half of the I/O place must of necessity be the logical and of some number of transition z lines.
This output, and henceforward any matrix output to a place, is called the a line. Since it is the logical and of z lines, it will normally be high. Thus the downward pulse generated by any of its input transitions upon firing will appear on the a line to the output half of the place. The rising (trailing) edge of this pulse should change the level on the ready line, making the control link active. A simple implementation is shown in figure II-11. At initialization the ready line will be set to the low state, as will the acknowledge line, so initially the link will be inactive.

Figure II-10. Demonstration that input places to a transition must be emptied before the output places can be filled.

Figure II-11. The output place.

The input half of the I/O place must do two things. It must indicate the presence of a token after receipt of the acknowledge, and it must be emptied on receipt of a signal on its own a line from the matrix. This signal, as discussed previously, is a high to low transition in the z line output from some transition that is an output of this I/O place. Since the a line is a logical and of z lines, this produces a high to low transition on the a line (the tacit constraint made is that only one of the transitions connected to a given a line is firing at a time; otherwise the level changes could overlap destructively). This high to low transition empties the place, which as described earlier contributes to the subsequent low to high transition on the a line.

This half must also provide the matrix with a signal indicating the presence of a token, and a signal indicating the absence of one, for the x and y inputs to each transition in its output. Let the b output from a place be high if the place is empty, low if full. Let the c output from a place be high if the place is full, low otherwise. Then the x input of a transition is connected via the matrix to the c line from each place that is an input to that transition, and the y line is similarly connected to the corresponding b lines. Thus the x line can only be high if all the connected c lines are high, in other words, if all of the transition's input places are full. Likewise, the y line will be high only if all the connected b lines are high, meaning that all of that transition's input places are empty. The circuit used to implement this half of the I/O place is shown in figure II-12. It has the desired switching properties, and provides the desired information on the b and c lines. The LED will be on whenever the place is full.

Figure II-12. The input place.

Figure II-13 shows the matrix connections for a transition having the output half of an I/O place in its output, and figure II-14 shows the same for a transition having the input half of an I/O place in its input. In both figures the circled intersections show where transistors connect the intersecting wires.

Figure II-13. Matrix connections for a transition feeding an output place.

Figure II-14. Matrix connections for an input place feeding a transition.

The Internal Place

This place does not interface with the outside world, but is both filled and emptied by signals on its a line from the switching matrix. Since there is only one a line, no overlap between signals to empty or fill the place is allowed. Therefore the place can only be filled on the trailing edge of the signal from its input transition. Otherwise, the place would indicate full before the a line has returned to the high level; then one of its output transitions could attempt to empty it before the a line has returned to the high state, overlapping the two signals.
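The fill and empty conventions for a place on its single a line can likewise be stated as a small event driven sketch. The following Python fragment is our own illustration (the method and attribute names are invented); it assumes, as the text requires, that a pulse on the a line acts as a fill when the place is empty and as an empty when it is full, since the two can never overlap.

class InternalPlace:
    """Sketch of the internal place: emptied on the falling edge of its a line,
    filled only on the trailing (rising) edge of a pulse from an input transition."""
    def __init__(self, initially_full=False):
        self.full = initially_full     # the 'select initial state' input
        self.a = True                  # the a line is normally high
        self._fill_pending = False

    def a_level(self, level):
        """Called whenever the a line from the switching matrix changes level."""
        if self.a and not level:       # falling edge
            if self.full:
                self.full = False      # an output transition removes the token
            else:
                self._fill_pending = True   # an input transition is firing
        elif not self.a and level:     # rising (trailing) edge
            if self._fill_pending:
                self.full = True       # the token appears only after a is high again
                self._fill_pending = False
        self.a = level

    @property
    def b(self):                       # to the matrix: high when empty
        return not self.full

    @property
    def c(self):                       # to the matrix: high when full
        return self.full


p = InternalPlace()
p.a_level(False); p.a_level(True)      # one pulse from an input transition
print(p.c, p.b)                        # True False: the place is now full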
The internal place must also provide for the matrix the same information that the input half of the I/O place provides. The only further constraint is that one of its input transitions must not fire while one of its output transitions is firing. This constraint is no problem because it is built into any net we intend to simulate anyway. The circuit to implement this place is shown in figure II-15. When the Q outputs of both flip-flops are low, the place is empty; when both are high, it is full. Since it is at times useful to have tokens present in certain places when a net is initialized, an input (select initial state) is added so that at initialize time one can either reset the flip-flops (empty place) or preset them (full place). Figure II-16 shows the matrix connections necessary to connect an internal place with its input and output transitions; the circled intersections show where transistors connect the intersecting wires.

Figure II-15. The internal place.

Figure II-16. Matrix connections for the internal place.

Decision lines

The decision line is simply a wire that is high to indicate true and low to indicate false. Figure II-17 shows the physical realization, and shows that from the one line from the outside world two lines enter the matrix. The T line is high if the decision line indicates true, low if it indicates false. The F line is high if the decision line indicates false, low if it indicates true. Figure II-18 shows how this may be used to alter the flow of control. If the place is full, normally both t1 and t2 would be enabled to fire, as both would have high x lines. In this case, however, if the decision line is high (true) the T line is high and the F line is low. Thus the x line input to t2 cannot go high, and t2 cannot fire. Since the T line is high, the x line input to t1 is allowed to go high, and thus t1 is allowed to fire. If the decision line is low (false) the reverse occurs and t2 is allowed to fire. Thus branching can occur in the flow of control.

Figure II-17. The decision line.

Figure II-18. Operation of the decision line.
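A short sketch makes the gating explicit. In the Python fragment below (our own illustration; the function names are invented), the single decision wire is split into the complementary T and F lines, and the matrix and function combines each with the place's c line to form the x inputs of t1 and t2, so that exactly one of the two is ever enabled.

def decision_lines(decision):
    """The one external decision wire becomes two matrix inputs (figure II-17)."""
    T = decision              # high when the decision reads true
    F = not decision          # high when the decision reads false
    return T, F

def x_input(place_full, gate):
    """A transition's x line is the matrix and of the place's c line and the
    decision line routed to that transition (figure II-18)."""
    return place_full and gate

# With the place full, the decision value selects which transition may fire.
for decision in (True, False):
    T, F = decision_lines(decision)
    print("decision", decision,
          "-> t1 enabled:", x_input(True, T),
          " t2 enabled:", x_input(True, F))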
The Start Place

Since this is an actual physical realization of the control structure, there must be some means for starting the net operation after initialization. This is done with a pushbutton operated place called the start place. It acts as the input place to one or more transitions, to which it appears to fill and then empty. Its implementation is shown in figure II-19. On pressing the start button, the one-shot produces a 300 microsecond high pulse on the c line, and a 300 microsecond low pulse on the b line. This emulates the filling and emptying of a transition's input place. These b and c lines are connected to the x and y lines of transitions in the same manner as those from any other place that is in a transition's input. Thus when the button is pressed, the z line from any transition this place is connected to will be low for about 300 microseconds, the rising edge of which can be used to put tokens into places in the rest of the structure.

Figure II-19. The start place.

II-C. The Register Matrix

The heart of the programmable feature of the programmable array is the transistor switching matrix, with its ability to be dynamically altered, adjusting the interconnections between the places and transitions to simulate any well defined Petri net that, of course, fits on our simulator with regard to the numbers of places and transitions. To control this transistor matrix, we have designed and constructed what we call the register array. This array is a fully address decoded memory, but with the additional feature that each bit in the memory is continuously accessible from the outside. The input addressing circuitry is required to be able to efficiently enter data into the registers through a single input port, so that the initial entry of a net "program" or subsequent minor changes can be easily accomplished. This method of organization will also greatly simplify the development in section III of the memory controller for the multiplexed array simulator. The output access to each bit is necessary so that each transistor group in the matrix (consisting of the base input leads of one or two switching transistors) is held in the desired state (either on or off) continuously while the system is running.

Organization

The register array logically consists of two separate parts: the registers, and the address decoding and strobe circuitry. The registers are a collection of D-type flip-flops connected as in figure II-20. This figure details the type of connection present at each of the thirty intersections of rows and columns in the matrix. There are fifteen columns, one for each of one start place, four input places, four output places, two decision lines, and four internal places. There are likewise two rows: one for the 'top' four transitions (t1 through t4), and one for the 'bottom' four transitions (t5 through t8). This method of having to load multiple intersections at a time (i.e., four per row, instead of one) was due to the fact that the available register chips (74175 quad D flip-flops) had a common internal clock, and thus individual flip-flops could not be loaded easily. Thus one column intersecting four rows (four intersections) must be loaded at a time.
Note also that the decision lines and internal places each require two bits to specify their intersection connections (three types are possible); the start, output, and input places each require only one bit (only two types of intersections). At these two-bit intersections an extra chip (four more registers, for a total of eight) was inserted to hold the data: four intersections at two bits per intersection, eight bits in total. The data input switches/lines number eight, and are grouped in four pairs of two, one pair for each intersection. If the intersection requires two bits (decision line, internal place) both switches are used. Only the low order bit switch is used for the intersections that can be specified by a single bit (start, input, and output places); the setting of the high order bit switch is then irrelevant.

Operation

The operation of loading data into a selected register (i.e., for four intersections) is very simple. Examining figure II-20, note that each row and column address line, plus the memory strobe line, is normally low; thus the clock input to the flip-flop is held at ground. This clock input is the wired 'and' of these three lines: all must be high before the clock will go high, which clocks the data lines into the register on the rising edge. To select a specific intersection, the row and column lines are brought high, and then the memory strobe is activated, producing a brief (1 to 10 microsecond) pulse. Thus the clock input of the desired chip, and this chip only, will see this positive going pulse, accepting the data on its inputs into the registers on the rising edge.

Figure II-20. A typical intersection at a row and column junction in the register array circuit.

The address decoding scheme used is pictured in figure II-21. The principle is very simple. The binary input on each of the row and column address lines is directed through a data distributor network which sends a high level to only one output line (2^n output lines for n input lines), keeping all the others in the low state. The output line is chosen uniquely, depending upon the binary value of the input data. The intersection of these high levels then determines where the memory strobe pulse will appear.

Figure II-21. The register array address decoding circuitry.

Figure II-22. Register array address circuitry truth table.
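The write cycle can be summarized procedurally. The Python sketch below is our own illustration of the addressing principle (the function names and the dictionary standing in for the flip-flops are invented): each address field is expanded one hot by a distributor, and only the intersection whose row select, column select, and strobe are all high latches the data lines.

def one_hot(address, width):
    """Data distributor: exactly one of the 2**width outputs is driven high."""
    return [i == address for i in range(2 ** width)]

def strobe_register_array(row_addr, col_addr, data_bits, registers):
    """Behavioral sketch of one write cycle.  The clock at every intersection is
    the wired and of its row select, its column select, and the memory strobe,
    so only the addressed register group latches the data on the rising edge."""
    rows = one_hot(row_addr, 1)       # two rows of four transitions each
    cols = one_hot(col_addr, 4)       # sixteen column codes (fifteen used)
    for r, row_sel in enumerate(rows):
        for c, col_sel in enumerate(cols):
            if row_sel and col_sel:   # the strobe pulse reaches this chip only
                registers[(r, c)] = list(data_bits)

# Example: load the eight data switch bits for row 0, column 3.
registers = {}
strobe_register_array(0, 3, [0, 0, 1, 1, 0, 0, 1, 0], registers)
print(registers)                      # {(0, 3): [0, 0, 1, 1, 0, 0, 1, 0]}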
The memory circuitry also includes two other features. The 'enable' line must be in a low state for the address decoders to work. If it is in a high state, the row and column select lines will all be clamped to ground, and the memory strobe will have no effect. This is useful for implementing protection against a program being altered unwittingly. A direct clear of the complete register array (resetting all connections to the 'no connection' value) can be effected by bringing the memory clear line to ground. For normal operation, it must be held in the high state.

Programming the Array

Entering a program into the register array is now a simple procedure. Using the matrix notation for a Petri net structure as developed in section I, we have encoded the sample Petri net shown in figure II-23A into the matrix form in figure II-23B. Note the addition of the row and column addresses on the edges of the matrix form. Applying the conversion chart of figure II-23C, which gives the bit patterns for each of the symbols dot, cross, and no connection for each form of place, yields the net program of figure II-23D. The program was constructed by noting that by clearing the array initially, all intersections are set to 'no connection'. We then tabulated each dot and cross in the matrix, along with its row and column address, to form the 'program listing', by substituting the bit values from the conversion chart. Note also, from observing the Petri net structure in figure II-23A, that the bits to initialize internal places one and four to the full state, and internal places two and three to the empty state, must be set. Also, the arbiter between transitions one and two must be enabled, and all the others should be disabled (to take full advantage of the asynchronous structure). Using this procedure on any Petri net matrix representation, we can thus generate the 'program' that is entered into the register array to simulate the net; a sketch of the conversion step follows.

Figure II-23A. A sample Petri net control structure to be simulated (transitions T1 and T2 arbitrated; internal places 1 and 4 initially full).

Figure II-23B. Matrix notation for the Petri net structure above, with row and column addresses added.

Figure II-23C. The conversion code from matrix notation to bit notation as used in the memory and register array (x denotes a don't-care bit; code 01 is not used for places, and code 11 is not used for decision lines).

Figure II-23D. Final generation of a complete program for the Petri net simulator, using the sample net in figure II-23A (internal place set bits: p1 = 1, p2 = 0, p3 = 0, p4 = 1; arbitration select bits: t1-t2 = 1, all others 0; all other locations are zero).
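The conversion step itself is mechanical, and the Python sketch below shows the idea. It is our own illustration: the particular bit codes assigned to dot, cross, and no connection are placeholders, not the actual codes of figure II-23C, and the symbol names are ours; only the procedure (clear the array, then list each marked intersection with its row and column address and its substituted bit value) follows the text.

# Placeholder bit codes (NOT the actual assignments of figure II-23C).
# Two-bit codes are needed where three kinds of intersection are possible.
CODES = {
    "simple":   {"none": "x0", "mark": "x1"},                  # start, input, output places
    "internal": {"none": "00", "dot": "10", "cross": "11"},    # illustrative only
    "decision": {"none": "00", "true": "10", "false": "01"},   # illustrative only
}

def program_listing(marked):
    """Produce (row address, column address, data bits) entries for every marked
    intersection; everything not listed stays at 'no connection' because the
    register array is cleared before loading."""
    listing = []
    for (row, col), (kind, symbol) in sorted(marked.items()):
        listing.append((row, col, CODES[kind][symbol]))
    return listing

# A fragment of a net: transition row 0 takes its input from the place in
# column 1 and puts its output token into the internal place in column 11.
marked = {(0, 1): ("simple", "mark"), (0, 11): ("internal", "dot")}
for row, col, bits in program_listing(marked):
    print(f"row {row:01b}  column {col:04b}  data {bits}")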
III. The Multiplexed Array

Originally it was planned to build what was called a multiplexed array. In many respects this was similar to the final design described in section II: it would use a transistor array for the switching matrix and a register array to store the desired pattern of interconnections, and it would have a row of places along the top of the array and a column of transitions along the right hand side. Its use would be much different from that described for the final design, however. The idea was to use a small array to simulate a larger array. The larger array would represent a control structure diagrammed in the Petri net array representation. This array would be large, so the expense of registers would make impractical the simulation of the entire net on a large switching matrix. A cheaper alternative, however, would be to divide the array into equal size subsections, and then simulate each subsection in turn on a much smaller array. The cost saving would result from the much smaller register array required to store the pattern of interconnections for the smaller array.

This introduces a number of problems. First, how should one divide up the larger structure? Second, how is one to switch from one subsection to another, and when? Each of these introduces a host of other problems. Although it was realized after the design was essentially complete that the technique was impractical because of the overhead involved in changing from one subsection to another of the simulated array, the following sections will illustrate the development of the design up to the decision to follow a different tack.

A. The Multiplexing Technique

Dividing Up the Array

Figure III-1 shows three basic ways of subdividing a large array into four equal sections. The simulating array corresponding to each method would then have as many places as are directly above each section, and as many transitions as are directly to its right. The important point to be made from these diagrams is that methods A and B require that the places be divided up for the simulating array. Since places are entities that store information, the structure that is used to simulate a larger array must still have a representation for all the places of the larger array. Thus methods A and B will require the full number of places to be implemented, not just one-fourth or one-half the number. The subsections in these methods, however, cannot communicate with all of the places at once, but only with a fraction of them. Thus methods A and B require some sort of data distribution network or multiplexor so that each subsection communicates with only that subset of places that would lie directly above it in the large matrix. Method C, however, does not require any such multiplexing of the lines from the switching matrix, as each section looks at all the places. Transitions require no multiplexing in any of the methods, since they do not store information, but merely transmit it from place to place. Clearly, then, method C of division is much more advantageous than the other two.

Figure III-1. Dividing up an array (methods A, B, and C).

This method has other advantages, also. It imposes no constraints on the arrangement of intersections in the large array being simulated relative to the subsection locations.
Since each row is a subsection coresponds to a whole row in the large array, each transition will be connected to all of its input and output places at once, and therefore will not need any information that may be present in another subsection. In addition, places may have input or output transitions in different sections of the large array. This is a direct result of the constraints that only one token may be present in a place at any one time, and that a place having more than one output transition can only fire one of them at a time. Thus a place can have output transitions in different sec- tions of the large array. them at a time, Since it is constrained to fire only one of the separation can cause no hazards. In fact, a place to transition arrangement that requires arbitration need only have the two or &ore transitions separated into different sections. The arbitra- tion being carried out as the place can only fire at a given time those transitions in the same section. This arbitration is biased, but is fault-free. Again, since a place can have only one token in it at a time, only one of its input transitions can fire at a time. Therefore its input transitions also may be in different sections of the simulated array without any difficulties. Figure III-2A shows a typical array pattern, assuming that the simulated array is sixteen places by sixteen transitions. If one is going to multiplex this array on a sixteen place by four transition array, it will be divided into the four sections shown in figure III-23, each of which will have its pattern sequentially imposed on the simulating array. 81 Places i I I UiI i 11 Xeex * 00 x x Figure III-2A. IIL1 I T XX) e . Representative array pattern. i If Transitions 82 I F Figure III-2B. Division into four subsections using method C. 83 Achievement of Multiplexing There remains the question of how and when to switch from simulating one subsection to another. First, one could allow each section simulation to run when to do so. until it Two criteria can be used to decide transitions can fire and has reached a point where none of its no external links are active. state, and in this state all This state shall be called the blocked action has ceased. sible because no transitions can fire to fill can be filled No more action is pos- new places and no places on the completion of some external operation. is not ideal because it This method is easy to construct an array subsection that will never become blocked, and thus would never allow switching to another subsection. A second method would be to allow each section to run for some specified time period, and then switch to the next section. In this method one could never get hung up in a particular subsection, does have its disadvantage. but it Using this "time-slicing" technique, a sec- tion could become blocked early in the time slice, and thus remain inactive for the remainder of the time period. This idle time is complete- ly wasted. The ideal solution is to use a combination of the two techniques. The array would be designed (as it was, in fact, done) to switch from one section to another either after a section has run for a specified time period or after it has become blocked, whichever occurs earliest. 84 B. Implementing the Array The Switching Matrix The switching matrix used is exactly similar to the one described in section II for the final version. in the arrangement of vertical wires. 
B. Implementing the Array

The Switching Matrix

The switching matrix used is exactly like the one described in section II for the final version; there are only some slight changes in the arrangement of the vertical wires. The interconnection technique, and therefore the signalling conventions, are themselves exactly the same. Its design, however, had not been finalized when the decision was made to change the overall design, so no further description is necessary.

The Array Control Circuitry

The control circuitry must accomplish several tasks. It must time the running of each subsection according to the time-slicing scheme, and at the end of the time slice it must stop the running of the array (because, according to the constraints, the change can only be done when no transitions are firing) and cause the simulator to change from one subsection to another. It must also continuously test for the blocked state, and when it detects this state before the time slice is completed it must cause the array simulator to switch to the next subsection. The actual switching from one section to another is done by a separate memory controller that changes the contents of the register array storing the pattern of interconnections for the current subsection. The array control provides the memory controller with a number indicating which subsection to simulate next. A standard (ready-acknowledge) control link links the two control units. The array control sends a ready on the link when the subsection is to be changed, and receives an acknowledge when the switch is complete. The memory controller is described in section III-C.

The array controller, then, consists of a timing circuit that is started at the beginning of each time slice; a block tester (to be described later) whose output is high when the simulator is blocked; a counter that outputs the number of the next subsection to the memory controller; a blocking output that goes high when the time slice is complete, in order to block the array in preparation for switching sections; and the necessary switching logic to coordinate these functions. Figure III-3 shows an implementation of this control circuitry. The switch starts the system after initialization; on being pressed it toggles flip-flop 1, sending a ready to the memory controller, which (since the counter state is 0000) sets the register array to represent the first subsection. The block output is the exclusive-or of the acknowledge line and the Q output of flip-flop 2, and thus is initially high. This is because during the array switching it is necessary that the transitions be disabled. When the acknowledge returns, the level change causes the block output to go low, enabling all transitions and thus allowing the array to proceed. It also produces a pulse that triggers one-shot number 1, whose external timing components are chosen to produce a pulse whose width equals the desired length of the time slice. This pulse also clocks the counter to indicate the next subsection.

One of two things can then occur. The output of the block tester could go high, indicating that the array is blocked, or the Q output of one-shot number 1 could go low, indicating the end of the time slice. If the former occurs, a low pulse appears at the reset input of one-shot number 1, resetting it. The one-shot's Q output goes low, toggling flip-flop 2, causing the block output to go high, which will disable the transitions during section switching.

Figure III-3. Multiplexed array control circuitry (the block tester contains a retriggerable, rising edge triggered one-shot of width 250 ns).
The low to high change of the block tester's b output also causes a pulse to toggle flip-flop 1, sending a ready to the memory controller, which will then switch the register array to the next subsection. If the Q output of one-shot number 1 falls before the array becomes blocked, that level change toggles flip-flop 2, sending the block output high. When block tester output a goes high, indicating that all transitions are disabled, a pulse toggles flip-flop 1, sending a ready to the memory controller. Receipt of the acknowledge starts the whole cycle again.

The Block Tester

Since TTL logic has finite delays, the signal from a place indicating that it is full after receipt of an acknowledge on its control link takes a finite time to propagate to the input of a transition. This time could be between 10 and 20 nanoseconds. Thus for a period of time all control links could be inactive and all transitions inactive while such a signal propagates to the x input of some transition. This is not the only situation in which the array could appear blocked for a short period. If one allows the use of arbitrated pairs of transitions, both transitions could be enabled but not firing, as the arbiter takes some arbitrary time to decide which is to fire. Thus one can never be sure within a finite time bound (see Patil (7)) whether a transition is going to fire by checking its z output. Therefore one must test the x input, before the arbiter, to see if a transition is enabled or firing. Again, though, the x line input to a transition may be low for some time before the z line goes high (depending on the emptying speed of its input places). At this time all control links could be inactive, because the rising edge on the z line, which puts tokens into places, has not yet occurred. The maximum amount of time this state may exist is about 100 to 150 nanoseconds. Thus, if the block tester is designed so that its b output is high only when both the x lines and the control links have been inactive for at least 250 nanoseconds continuously, it will indicate the existence of the blocked state.

The one-shot enclosed in the block tester in figure III-3 performs such a test. The output of the and gate that its Q output feeds will be high only if the one-shot's trigger input has been high for at least 250 nanoseconds, and the trigger input will be high only if all transition x lines and all place control links are inactive. Block tester output a is high only if all the transitions have been blocked by the control output "block". Since this indicates the end of a time slice, when a is high it resets output b, to ensure against spurious transitions at the b output.
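In software terms the block tester is a retriggerable timeout. The Python sketch below is our own polling analogue of it (the class and method names are invented; the 250 nanosecond figure is the hold time given above): any activity on a transition x line or a control link restarts the interval, and the blocked state is reported only after the array has been quiet for the full hold time.

import time

class BlockTester:
    """Polling analogue of the 250 ns retriggerable one-shot in figure III-3."""
    def __init__(self, hold_time=250e-9):
        self.hold_time = hold_time
        self.last_activity = time.monotonic()

    def sample(self, x_lines, links_active):
        """x_lines: the transition x inputs; links_active: the place link testers."""
        now = time.monotonic()
        if any(x_lines) or any(links_active):
            self.last_activity = now       # retrigger: something may still fire
            return False
        # quiet at this instant, but blocked only if quiet for the whole hold time
        return (now - self.last_activity) >= self.hold_time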
prevents the input transi- Thus it tion from firing until the place is empty. A decision line has been added to each place, also, so that each one in the set of places is capable of being used for decisions, and no places will be wasted, as in the original version. The last addition made is an exclusive-or tester of the control link that is high when the link is inactive, low when it This signal is one of the a inputs to the block tester. is active. The matrix connections for the a, b and c lines of each place are exactly as for an internal place of the final design in section II. a schematic for each place, Figure 111-4 is and figure 111-5 illustrates the timing relationships. Transitions Basically, section II. the simple transition is like the one presented in However, another input, the block input, has been added. This input is to disable the transition, and thus prevent it ing, from- fir- inorder to allow the switching from one subsection to another. However, one can not stop the transition in the middle of firing, so therefore an arbiter must be used on the z output. goes high before the z line goes low, If the block input the transition will block, and the block input will be passed through the arbiter and returned to the b inputs of the block tester, indicating that that transition is disabled. Since it is desired that each transition be in the quiescent state throughout the switching process itself, the block output from the ar- 90 acknowledge ready T1 decision ty tester (input a) Q Q R J _ Ck K I I A5 +5 H K Ck J T I; 4 initialize 330 U y_ ohms R J Ck R K K Ck J initia ize P a (from matrix) normally 1 normally: b 1 c dT (all to matrix) 0 0 Figure 111-4. The place for the multiplexed array. F T 91 a ready [40n acknowl edge 10ns I ready acknow Ledge -v On or I 45ns b c d -v20ns Figure 111-5. ' 20ns Timing for a typical place in the multiplexed array. 92 biter is used to force the x line low and the y line high. This assures that each transition is ready to be fired from the quiescent state by the time the switching from section to section is complete. The x line, after inversion, is sampled at the c inputs of the block tester. Be- cause of the inversion these inputs will be high if the x line is low. Figure III-6 shows the circuit for the simple transition. (For a dis- cussion on construction of an arbiter suitable for use in this circuit, see Patil 7 .) to make. The extension to an arbited pair of transitions is simple The blocking arbiters may be placed anywhere on the z line out- put before it enters the matrix, and the forcing gates should be placed before the x line input to the transition arbiter. 93 block (from main control) Y x R - B I (from matrix) - - Y (from matrix) T+5 Ns- (to matrix) t to block tester input c) Figure 111-6. Simple transition for the multiplexed array. Sto block ester input b) 94 III-C The Memory Controller Module The previous section dealt with the design and implementation of the control circuitry required in the multiplexed array simulator. This section concerns the operation of the memory control module, which is responsible for controlling the loading of the register array from the bulk memory. Requirements The memory controller module will interface the bulk memory, in which the structure of the entire Petri net matrix will be stored, to the register array, which can contain a single subsection of this memory at a time. 
Upon receipt of a ready command from the multiplex control circuitry, and the specification of which subsection to load, the module will transfer the contents of the subsection, as stored in the bulk memory, to the register array, thus setting the switching matrix transistors. When the transfer is finished, the module will return an acknowledge signal, indicating to the control circuitry the register has been loaded, and to begin simulation of the newly loaded subsection. Organization The bulk memory, as well as the register array, will be organized on the basis of a four bit word. This choice results from the fact that at each place - transition intersection in the multiplexed array, there are four separate groups of switching transistors to be controlled individually to form the required intersections. So that no decoding circuit is required on the output of each register, a four bit representation was chosen. This, as it happens, also fits cleanly on a four bit register 95 single TTL chip. The physical layout of the register array and switching matrix is nearly identical to that designed for the programmable array. Only the dimensions have changed: PROGRAMMABLE VERSION MULTIPLEXED VERSION (1) 2 bits/word (1) 4 bits/word (2) 8 places by 8 transitions (2) 16 places by 4 transitions (3) lead multiple intersections into register array at a time (3) load only one intersection into the register array at a time For simplictty in address decoding, it will also be beneficial to organize the sizes of the subsections, bulk memory in multiples of two. the register array, and the In this manner addresses can be coded and decoded using binary subfields, and no oddball arithmetic need be performed in conversion. For example, given our multiplexed register array is organized as a sixteen column ( places ) by four row ( transitions ) system, then four bits are needed for the column address, and two bits for the row address. If, as in our multiplexed array example, we desire to have a maximum of sixteen transitions ( rows ) in the entire net, thus 16 / 4 = 4 binary address. subsections are needed. This requires a two bit Then we can organize addresses in the system using this coding scheme detailed on the top of the next page. Using this scheme, simple binary counter circuits can be constructed for the subsection, row, and column counters ( addresses ), and their outputs can be merged as bit subfields to form the required bulk memory and register array addresses. 0 96 "4 BULK MEMORY ADDRESS = REGISTER ARRAY ADDRESS = X. XXXX XX = 256 x 4 bit bulk memory XXXX XX = ( 16 x 4 ) x 4 bit register array Figure 111-7 depicts a block diagram of the entire memory control system, and shows its interfaces to the bulk memory, register array and switching matrix, and the multiplexing control circuitry. The bulk memory module can be any device capable of acting as a memory. Any type of TTL compatible memory device can be employed ( ROM, RAM, PROM, etc. ) to act as a storage medium ( for RAM, it will be assumed that the matrix has been previously loaded by some means ). The bulk memory module will then, when handed an address and a ready signal, return the data word at that location, and an acknowledge signal that it has finished. Operation Figure 111-8 depicts the internals of the memory control module's control circuitry. Completely asynchronous, it functions as follows. 
Operation

Figure III-8 depicts the internals of the memory control module's control circuitry. Completely asynchronous, it functions as follows. On receipt of a ready command from the control circuitry (rc), it resets the row and column address counters and the overflow bit to zero, and raises the register array enable line from ground so that the register array's internal address decoding circuitry is activated; when these operations are complete, the loading cycle begins. The loading cycle consists of:

(1) send a ready (rm) signal to the bulk memory, requesting a data word (the address has already been set up);
(2) when the memory acknowledge (am) is received, the data is available from the memory;
(3) send a strobe pulse (strobe) to the register array, entering the data word at the current row/column position;
(4) clock the row/column counters to the next address (possibly setting the overflow bit, on the last cycle).

This sequence of operations is repeated until the overflow bit (ovf) is set, indicating that the entire subsection has been transferred from bulk memory to the register array. At this point the enable line (enable) to the register array is brought low, disabling its address circuitry; finally an acknowledge (ac) is returned to the control circuitry, indicating the completion of the transfer operation. The actual timing relationships are detailed in figure III-9. Figure III-10 indicates the data flow and address decoding required to access the words in the bulk memory and transfer them to the correct locations in the register array.

Figure III-8. Memory control module internal control circuitry.

Figure III-9. Memory control circuitry (of figure III-8) timing diagram.

Figure III-10. Memory control module internal data flow.
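Written out sequentially rather than as asynchronous logic, the transfer loop of figure III-8 looks like the Python sketch below. It is our own illustration (the callback names are invented, and the counting order and address packing repeat the assumptions made earlier); the two callbacks stand in for the ready/acknowledge exchange with the bulk memory and the strobe into the register array.

def load_subsection(subsection, bulk_memory_read, write_register,
                    n_cols=16, n_rows=4):
    """Transfer one subsection from bulk memory to the register array: for every
    intersection, fetch the four bit word and strobe it in, stepping the row and
    column counters until they overflow."""
    for col in range(n_cols):                          # counters start cleared
        for row in range(n_rows):
            address = (subsection << 6) | (col << 2) | row
            word = bulk_memory_read(address)           # rm / am handshake
            write_register(col, row, word)             # strobe into the array
    # overflow reached: drop the enable line and return the acknowledge (ac)

# Minimal demonstration with an all-zero bulk memory image.
loaded = {}
load_subsection(1, lambda addr: 0, lambda c, r, w: loaded.__setitem__((c, r), w))
print(len(loaded))                                     # 64 intersections written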
In Conclusion

As mentioned in the introduction, a combination of microprogramming techniques with asynchronous control structures would provide the ideal blend of high speed and processing flexibility that would be very attractive for ultra high speed computing systems. Patil's original diode structure (8) comes close to this ideal. The system described in section II of this thesis realizes this type of structure completely. Any program (Petri net structure) may be simulated on the array merely by changing the contents of the register array. This could easily be done from any sort of memory device. Parts of the array could even be reprogrammed during running of the system. In this way one could implement intermittent program stops for machine checking, and other functions that would require changes from the normal operation.

This array, at the time of submission of this thesis, was nearing completion at the Computation Structures Laboratory of Project MAC. It will be used for the demonstration and illustration of the techniques of asynchronous computation and the concept of logic arrays as control structures.

Perhaps even more powerful than the use of totally reprogrammable arrays like the one presented would be the combination of Patil's original non-reprogrammable diode array with the reprogrammable array. This could save considerable expense, as the diode array could be cheaply mask-programmed on a semiconductor chip, and thus the large number of registers required to hold the pattern of interconnections for a large array would not be needed. It would still retain the flexibility for run time reprogramming in the section that remains reprogrammable. Changes in the total structure would be implemented simply by changing the mask for the diode array at manufacture.

REFERENCES

(1) Baker, H., "Petri Nets and Languages", Computation Structures Group Memo No. 68, Project MAC, Massachusetts Institute of Technology, Cambridge, MA, May 1972.

(2) Dennis, J. B. and Patil, Suhas S., "Computation Structures", Department of Electrical Engineering, Massachusetts Institute of Technology, Cambridge, MA, notes for subject 6.032, 1973.

(3) Dollhoff, T. L., "Microprogrammed Control for Small Computers", Computer Design, May 1973, pp. 91-97.

(4) Holt, Anatol W., "Introduction to Occurrence Systems", Associative Information Techniques, edited by E. L. Jacks, American Elsevier Publishing Co., 1971, pp. 175-183.

(5) Jump, J. R., "Asynchronous Control Arrays", IEEE Transactions on Computers, October 1974, volume C-23, number 10.

(6) Patil, Suhas S., "Macro-modular Circuit Design", Computation Structures Group Memo No. 40, Project MAC, Massachusetts Institute of Technology, Cambridge, MA, May 1969.

(7) Patil, Suhas S., "Bounded and Unbounded Delay Synchronizers and Arbiters", Computation Structures Group Memo No. 103, Project MAC, Massachusetts Institute of Technology, Cambridge, MA, June 1974.

(8) Patil, Suhas S., "An Asynchronous Logic Array", Computation Structures Group Memo No. 111-1, Project MAC, Massachusetts Institute of Technology, Cambridge, MA, February 1975.

(9) Su, Stephen Y. H., "A Survey of Computer Hardware Description Languages in the U.S.A.", Computer, December 1974, volume 7, number 12, pp. 45-51.