Proceedings of International Joint Conference on Neural Networks, Montreal, Canada, July 31 - August 4, 2005

Modification of the ART-1 Architecture Based on Category Theoretic Design Principles

Michael J. Healy†, Richard D. Olinger†, Robert J. Young‡, Thomas P. Caudell†‡, Kurt W. Larson§

†Department of Electrical and Computer Engineering, University of New Mexico, Albuquerque, New Mexico 87131. E-mails: mjhealy@ece.unm.edu, rolinger@ece.unm.edu
‡Department of Computer Science, University of New Mexico, Albuquerque, New Mexico 87131. E-mails: ryoung@cs.unm.edu, tpc@ece.unm.edu
§Sandia National Laboratory, Albuquerque, New Mexico. E-mail: kwlarso@sandia.gov

Abstract— Many studies have addressed the knowledge representation capability of neural networks. A recently developed mathematical semantic theory explains the relationship between knowledge and its representation in connectionist systems. The theory yields design principles for neural networks whose behavioral repertoire expresses any desired capability that can be expressed logically. In this paper, we show how the design principle of limit formation can be applied to modify the ART-1 architecture, yielding a discrimination capability that goes beyond vigilance. Simulations of this new design illustrate the increased discrimination ability it provides for multi-spectral image analysis.

I. INTRODUCTION

Many studies (see for example [2], [5], [10], [9], [16], and the review [1]) have addressed the knowledge representation capability of neural networks. We present an example to illustrate the improved performance achievable by applying neural network design principles derived from a recently developed theory of knowledge representation. The example is a small but significant modification to an ART-1 network [3], applied to multi-spectral image analysis.

The knowledge representation theory is based upon the mathematical rigor of category theory applied to neural network semantic modeling. Because category theory is as yet unfamiliar to many (until recently being regarded as the ultimate in pure mathematics), we begin with a brief overview of the topics necessary for understanding the work described here. Our semantic theory in its current state of development is described in full in [14], which contains a more comprehensive overview of category theory. Our previous work in applying the semantic theory to neural network analysis and design is described in [11], [12], [13]. Many applications of category theory exist, both to physical and computational theory ([7], [8], [19], [21], [22]) and to practice ([15], [23]).

The semantic theory addresses the question of where and how knowledge is acquired, organized, and stored in connectionist systems. The knowledge has the structure of a hierarchical system of concepts, directed from the abstract to the specific. The theory explains knowledge acquisition and representation in a neural network as an incremental reuse of existing concept representations, combined with data to form new representations of both more abstract (or general) concepts and more specific (or specialized) ones. Applied to a neural network with many sensors and other functional sub-networks, the semantic model provides a mathematically rigorous yet natural explanation of the combining of network-region-specific hierarchy representations, so that the overall network, if well-designed, acts as if there were a single knowledge structure guiding its behavior.
Here, we focus upon the incremental knowledge representation in an ART-1 network, which has only a single region, associated with a single input layer. What we show is that the theory can be applied to improve the performance of even a single-region network in processing multi-modal information derived from a single sensor.

The paper is organized as follows. Section II provides a very brief grounding in the category theory used. In Section III we show how our categorically based semantic theory is applied to neural networks. In Section IV we describe the use of the theory in obtaining the ART-1 modification for our example. Section V describes our experimental method, Section VI the results, and Section VII presents the discussion and conclusion.

II. CATEGORY THEORY: A BRIEF INTRODUCTION

Category theory (see [20], [6], [17], [18], or the tutorial in [14]) is based upon the notion of an arrow, or morphism—a relationship between two objects in a category. A morphism f : a −→ b has a domain object a and a codomain object b, and serves as a sort of directed relationship between a and b. In a category C, each pair of arrows f : a −→ b and g : b −→ c (where the codomain b of f is also the domain of g, as indicated) has a composition arrow g ◦ f : a −→ c whose domain a is the domain of f and whose codomain c is the codomain of g. Composition is associative: for three arrows of the form f : a −→ b, g : b −→ c and h : c −→ d, the result of composing them is order-independent, with h ◦ (g ◦ f) = (h ◦ g) ◦ f. For each object a, there is an identity morphism id_a : a −→ a such that for any arrows f : a −→ b and g : b −→ a, id_a ◦ g = g and f ◦ id_a = f. A familiar example of a category is Set, which has sets as its objects, functions as its morphisms, and whose composition is just the composition of functions, (g ◦ f)(x) = g(f(x)).

Key notions for the theoretical background of this paper are commutative diagrams and initial and terminal objects. A diagram is a collection of objects and morphisms of C. In a commutative diagram, any two morphisms with the same domain and codomain, where at least one of the morphisms is the composition of two or more diagram morphisms, are equal. An initial object, where one exists in C, is an object i for which every object a of C is the codomain of a unique morphism f : i −→ a. A terminal object t has every object a of C as the domain of a unique morphism f : a −→ t. An important use of these key notions is in the definition of limits and colimits.
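To ground these definitions, the following minimal sketch (ours, for illustration only) models morphisms of Set as Python functions and checks the composition and identity laws on a sample object; the particular functions f, g, h are invented.

```python
# A sketch of the category Set: objects are Python sets, morphisms are
# functions between them, composition is ordinary function composition.

def compose(g, f):
    """Composition g . f : first apply f, then g."""
    return lambda x: g(f(x))

identity = lambda x: x  # plays the role of id_a for every object a

# Three composable morphisms f : a -> b, g : b -> c, h : c -> d
f = lambda x: x + 1
g = lambda x: 2 * x
h = lambda x: x - 3

a = {0, 1, 2}  # a sample object (a set)

# Associativity: h . (g . f) agrees with (h . g) . f on every element of a
assert all(compose(h, compose(g, f))(x) == compose(compose(h, g), f)(x)
           for x in a)

# Identity laws: f . id_a == f and id_b . f == f
assert all(compose(f, identity)(x) == f(x) == compose(identity, f)(x)
           for x in a)
```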
In [11] and [14] we have shown how colimits model the learning of more complex concepts through re-use of simpler concepts already represented in the connection-weight memory of a neural network. In [14] we show how limits model the learning of simpler, more abstract concepts through re-use of existing representations. Let ∆ be a diagram in a category C as shown in Fig. 1, with objects a1, a2, a3 and morphisms f1 : a1 −→ a3 and f2 : a2 −→ a3. The conical structure K extends ∆ to a commutative diagram with an apical object b and morphisms g_i : b −→ a_i (i = 1, ..., 3), with f1 ◦ g1 = g3 = f2 ◦ g2, provided additional objects and morphisms with the requisite properties exist in C; such a structure is called a cone. Cones for ∆ are the objects of a category cone_∆ (whose morphisms are described in [14]). A limit for the diagram ∆ is a terminal object K in cone_∆.

[Fig. 1. A limit for a diagram ∆.]

The importance of category theory lies in its ability to formalize the notion that things that differ in substance can have an underlying similarity of "structural" form. A mapping between categories that preserves compositional structure, called a functor, formalizes this notion. A functor F : C −→ D associates to each object a of C a unique image object F(a) of D and to each morphism f : a −→ b of C a unique morphism F(f) : F(a) −→ F(b) of D, and is such that (1) for each composition g ◦_C f in C, F(g ◦_C f) = F(g) ◦_D F(f), where ◦_C and ◦_D denote the respective compositions in C and D; (2) for each object a of C, F(id_a) = id_{F(a)}. It follows that F maps commutative diagrams of C to commutative diagrams in D. This means that any structural constraints expressed in C are translated into D.

III. APPLYING CATEGORY THEORY TO NEURAL NETWORK SEMANTIC ANALYSIS

Knowledge can be seen as a system of symbolic concepts: descriptions of objects, events, and anything else one can imagine, at any arbitrary level of generality or specificity. The system is organized as a hierarchy ordered from the abstract to the specific. Learning, the acquisition of a knowledge representation in a neural network, begins at the sensor level of processing. Concepts associated with sensor elements describe the sensor primitives. These are far from being the most complex, yet are not the simplest, concepts possible. Indeed, the neural network's learning algorithm effectively re-uses the sensor concepts in many ways in combination with the input data to form concepts not yet represented by the connection-weight array of the network. The more complex concept representations are formed via colimit generation in the network structure. (This implies that a category can be used to represent the diagrams and colimits; neural categories are discussed briefly below.) An abstraction process proceeds in the other direction via limit generation. The abstract concepts describe items that are shared by the more complex concepts in the diagrams over which limits are formed. Thus, the knowledge-representation process proceeds in both directions — specialization and abstraction — beginning at the sensor-percept level.

A category Concept provides the required mathematical model for the hierarchical structure of knowledge. This is a category whose objects are formal logic theories T and whose morphisms are theory morphisms s : T −→ T′. Briefly, s is a mapping of the quantities and axioms expressed in T into the theory T′ such that the images of the axioms of T are either axioms or theorems of T′ (see [14] or any of [6], [8], [19]).

Categories N_{A,w}, where A is a neural network architecture (such as a specific ART-1 network) and w is an array of connection weight values for it, provide the required mathematical model for neural networks in specified states of learning. The objects of N_{A,w} are the sets of inputs that "activate" pairs (p_i, η) given the weights in w, where p_i is a node of A and η is a set of output values for p_i. The set η is often modeled as an interval of real values where p_i has a real-valued signal function. A member of the activating set for (p_i, η) is an input pattern that causes p_i to generate an output signal in the set η.
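As a deliberately simplified illustration of the objects of N_{A,w} (our invention, not part of the theory's formal machinery), the sketch below models a node p_i with a real-valued signal function and tests whether an input pattern belongs to the activating set of a pair (p_i, η), taking η to be an output interval; the weights, signal function, and interval are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def activates(input_pattern, weights, eta):
    """True if input_pattern lies in the activating set of (p_i, eta),
    i.e. the node's output signal falls inside the interval eta."""
    lo, hi = eta
    output = sigmoid(np.dot(weights, input_pattern))
    return lo <= output <= hi

# Hypothetical weights w for node p_i and output set eta = [0.8, 1.0]
w = np.array([0.9, -0.2, 0.5])
I = np.array([1.0, 0.0, 1.0])
print(activates(I, w, (0.8, 1.0)))
```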
A morphism m : (p_i, η) −→ (p_j, η′) is the set of inputs that cause all the nodes lying along the paths of connections forming a bundle Γ to generate outputs within specified intervals. The paths in Γ share the common source and target objects (p_i, η) and (p_j, η′). If A is properly designed and w is an array of weight values acquired at some stage of learning from input patterns, it will be possible to define a functor M : Concept −→ N_{A,w}. This is a mathematical description of the representation of concepts and their morphisms in A at the stage of learning represented by w.

Each concept morphism s : T −→ T′ has an associated model-space morphism, a functor Mod(s) : Mod(T′) −→ Mod(T). Here, Mod(T) and Mod(T′) are categories of models, possible worlds or instances within which T and T′ hold, respectively. Since Mod(s) reverses the direction of s, each instance of T′ has a corresponding instance of T. This fact has great significance for neural networks. To see this, suppose that (p_i, η) and (p_j, η′) are the images of objects T and T′ under the functor M, with (p_i, η) = M(T) and (p_j, η′) = M(T′), and that m : (p_i, η) −→ (p_j, η′) is the image of s : T −→ T′, m = M(s). We associate the activating inputs for the objects (p_i, η) = M(T) and (p_j, η′) = M(T′) with objects in the model categories Mod(T) and Mod(T′), respectively. Given this association, every input that activates (p_j, η′) must also activate (p_i, η), a consequence of the existence of the model-space morphism Mod(s) : Mod(T′) −→ Mod(T). Let T be the apical concept of a limit cone for a diagram ∆ in Concept and let s : T −→ T′ be one of the leg morphisms for the limit cone. Then (p_i, η) must be activated whenever (p_j, η′) is, where (p_j, η′) can be any object in the diagram image M(∆).

IV. AN ART-1 NETWORK MODIFIED WITH LIMITS AT F1

In the following, we apply limits to supplement the ART-1 vigilance mechanism. This enhances the resolving power of ART-1, allowing it to control information loss in specific regions of the templates as they form; the vigilance mechanism alone allows control only over information loss in a whole template. Applying limits requires that we discuss ART-1 as an architecture that can be extended to have a categorical representation capable of containing the image of a functor from the Concept category. This is not a trivial task with any existing artificial neural architecture, but this need not prevent us from testing a partial categorically based extension. Accordingly, we have modified an ART-1 network by applying limits to discrete diagrams (i.e., having objects only) comprising disjoint subsets of nodes whose union is the entire F1 layer. Each node is associated with an object whose output set η includes all positive outputs. This allows the apical objects of the limit cones to be shown simply as nodes (see Fig. 2, where the apical objects are labelled "SM_i limit", SM standing for "sub-modality"). These nodes form a new layer which we call F−1. The limit cone leg morphisms are represented by bundles consisting of a single connection each, projecting from a limit node SM_i to each of the F1 nodes representing an object in its diagram. Each feedforward connection is paired with a feedback connection. If the neural cones are the functor images of limit cones in the category Concept, the reciprocal feedback connections represent the model-space morphisms corresponding to each of the limit cone leg morphisms. The feedforward connections have small positive weights and their reciprocal connections have unit positive weight, so that activity in SM_i has a minimal impact upon its F1 nodes while they, in contrast, provide excitatory input to it proportional to the number of them that are active. Letting the set of all nonzero outputs from each F−1 node represent the limit object of that node enforces the property that a concept morphism is accompanied by model-space morphisms: that is, if any one of its F1 nodes is active, a node SM_i will be active. Thus, the F1 nodes represent neural category objects which in turn represent concepts specifying sensor input properties, while the subset constituting the diagram for an SM_i limit node represents a property comprising a group of input properties. Each SM_i node in F−1 represents an aspect of the group property shared by the F1 nodes in its diagram.

[Fig. 2. A resonating template in an ART-1 network modified to extract abstract concepts corresponding to sub-modalities in each input pattern. The F1 nodes represent input concepts and the SM nodes represent the extracted sub-modality concepts.]
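A minimal sketch of the F−1 dynamics just described, assuming binary F1 outputs and unit-weight feedback connections; the function name and data layout are ours, not from the network specification.

```python
import numpy as np

def sm_layer_activity(f1_activity, diagrams):
    """Outputs of the F-1 sub-modality (limit) nodes.

    f1_activity : binary (0/1) vector of F1 node outputs
    diagrams    : list of index arrays, one per SM node, giving the
                  disjoint subset of F1 nodes in that node's diagram

    Each F1 -> SM feedback connection has unit weight, so an SM node's
    net input is the count of active F1 nodes in its diagram; it is
    active whenever at least one of them is (the limit property)."""
    return np.array([1.0 if f1_activity[idx].sum() >= 1 else 0.0
                     for idx in diagrams])

# Toy example: six F1 nodes split into two sub-modalities of three nodes each
f1 = np.array([0, 1, 0, 0, 0, 0])
diagrams = [np.arange(0, 3), np.arange(3, 6)]
print(sm_layer_activity(f1, diagrams))  # [1. 0.]: SM_2 is silent, so its
                                        # missing inhibition of S can fire V
```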
The basic idea in making use of the F−1 nodes was to have them supplement the vigilance node's F2 reset capability. This is the purpose of the connections from the F−1 nodes to node V via node S, as shown in Fig. 2. If resonance between the current input and a template pattern is about to occur, but one of the F−1 nodes is inactive because none of its F1 correspondents is active, the resulting lack of an inhibitory signal to S can allow its (tonic) activity to activate V, thereby effectively vetoing the resonance. In this way, each sub-modality is required to maintain at least one binary 1 in each template.

A further idea is to require an arbitrary number of binary 1s for each sub-modality in each template by imposing an adjustable but uniform threshold value t on the F−1 nodes. Here, all of the m sub-modality regions F1,i of F1 (i = 1, ..., m) have the same number of nodes s, so that n = ms, where n is the number of nodes in F1 (hence, also in F0). The bit-wise AND I ∧ T^k of the current input pattern I and choice template T^k consists of sub-patterns I_i ∧ T_i^k. Let ‖X‖ be the number of 1-bits in a binary pattern X. To avoid activating V, then, each sub-modality i must satisfy the inequality ‖I_i ∧ T_i^k‖ ≥ ts/2 (the factor 1/2 is based upon our use of complement-coded input patterns [10]). This requirement allows the user of the network to exercise more specific control over template information loss during recoding than is possible with a vigilance parameter alone, which requires only that ‖I ∧ T^k‖ ≥ ρ‖I‖, where ρ is the usual vigilance parameter. Just as ρ can be applied to control the amount of specialization versus generalization allowed in the templates (a higher value means fewer input exemplars per template, hence greater specialization), t can be applied to control the specialization versus generalization allowed in each sub-modality region of the templates (a higher value means greater specialization within the sub-modality). Given that an F1 activity pattern I ∧ T^k is made up of the activity patterns I_i ∧ T_i^k, and any F−1 node can activate V if its activity falls below its threshold t, it is natural to ask whether t can be used to eliminate ρ altogether. It can be shown that the usual test involving ρ is indeed redundant if t ≥ ρ. However, this is not the case when t < ρ, and therefore the parameter ρ cannot be eliminated.
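The two tests can be summarized in a short sketch (ours; the actual network realizes them through the S and V node dynamics rather than an explicit subroutine):

```python
import numpy as np

def passes_tests(I, T, m, s, rho, t):
    """Combined vigilance and F-1 threshold tests for a choice template T.

    I, T : binary arrays of length n = m*s (m sub-modalities, s bits each)
    rho  : vigilance parameter; t : uniform F-1 threshold
    Returns True if resonance is allowed, False if node V resets F2."""
    match = np.logical_and(I, T)
    # Usual ART-1 vigilance test: ||I ^ T^k|| >= rho * ||I||
    if match.sum() < rho * I.sum():
        return False
    # Sub-modality test: ||I_i ^ T_i^k|| >= t*s/2 for every region i
    # (the factor 1/2 reflects the complement-coded input patterns)
    for i in range(m):
        if match[i * s:(i + 1) * s].sum() < t * s / 2:
            return False
    return True

rng = np.random.default_rng(0)
I = rng.integers(0, 2, size=20)   # hypothetical 2-band input, 10 bits per band
T = rng.integers(0, 2, size=20)
print(passes_tests(I, T, m=2, s=10, rho=0.47, t=0.32))
```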
V. THE EXPERIMENTAL METHOD

A multi-spectral image was given as a set of 10 optical band amplitudes for each pixel (m = 10). This was to be used to produce a false color image. Our method was as follows. First, the 10-dimensional vector of analog values for each pixel was converted to a binary input pattern for an ART-1 network by converting the values to complement-coded stack numerals. Each stack code consists of a 0-1 binary array which is activated in contiguous fashion, with the number of binary 1s representing an amplitude (this is known widely as "thermometer code"), together with an array with the same number of binary values representing the complement of the first array (see [10], where ART-1 with this representation was proven equivalent to fuzzy ART). If there are N bits for the "positive" stack representing the amplitude, then there are s = 2N bits in the complement-coded stack numeral and, hence, ms = 10 × 2N = 20N bits in the resulting input pattern for ART-1 (hence, 20N input F0 nodes and 20N F1 nodes). An ART-1 network sorts the input patterns so formed into clusters, so that the templates can be decoded into hyperbox regions in 10-dimensional space. The hyperboxes all lie within the 10-cube defined by lower and upper bounds on the variation in band values. The color code for a pixel was then selected by first assigning a color code to the template with which it was associated following training on all pixels, and then using that color for the pixel. The color codes for the templates were assigned by first sorting the templates in decreasing order of the number of pixels with which they were associated, and then assigning colors starting with blue and proceeding through the visible spectrum to red and, for templates associated with fewer than 10 pixels, white (therefore, the color white can be associated with more than one template).

Two versions of ART-1 were used in the experiment: the modified ART-1 network with an F−1 layer as described, and an unmodified ART-1 network. For the modified network, the spectral bands were the sub-modalities, with each F−1 node serving as a limit node for the set of 2N F1 nodes representing its complement-coded band value. An activation threshold was used for the F−1 nodes, allowing the user to control information loss in all bands at an arbitrary, uniform level. All ART-1 simulations for this experiment were performed with our recently developed network specification and simulation package eLoom [4].

VI. RESULTS

To illustrate the formation of hyperbox templates with the modified ART-1 network, a simpler, two-dimensional example was processed first at several combinations of vigilance and F−1 threshold values (see Figs. 3 and 4). A data file of 500 random 2D points was created in Matlab using the rand function, with lower and upper bounds on x and y component variation of 0.0 and 128.0. These points were then preprocessed in Matlab to generate complement-coded binary input vectors based upon an N = 32-bit "positive" stack for each dimension, resulting in 2 × 2N = 4N = 128 bits per input pattern to ART-1. The resulting hyperboxes are shown, along with the points in each, in Figs. 3 and 4 for a vigilance level of 0.47 and thresholds of 0.32 and 0.40, respectively.

[Fig. 3. Two-dimensional hyperboxes generated by ART-1 with sub-modality limits: vigilance = 0.47, F−1 threshold = 0.32.]
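A sketch of the complement-coded stack (thermometer) encoding described above, under the assumption that a band value is scaled linearly into its number of 1-bits; function names are ours.

```python
import numpy as np

def complement_coded_stack(value, lo, hi, N):
    """Encode an analog band value as a complement-coded stack numeral.
    The 'positive' stack has N bits filled contiguously in proportion to
    value's position in [lo, hi] (thermometer code); its bitwise
    complement follows, giving 2N bits in all."""
    k = int(round((value - lo) / (hi - lo) * N))   # number of 1-bits
    positive = np.array([1] * k + [0] * (N - k), dtype=np.uint8)
    return np.concatenate([positive, 1 - positive])

def encode_pixel(bands, lo, hi, N):
    """Concatenate the complement-coded stacks of all m bands:
    m bands x 2N bits = the ART-1 input pattern for one pixel."""
    return np.concatenate([complement_coded_stack(b, lo, hi, N)
                           for b in bands])

# The 2D example: N = 32, components in [0, 128] -> 128-bit patterns
point = [37.5, 102.0]
pattern = encode_pixel(point, 0.0, 128.0, N=32)
print(pattern.size)  # 128
```

Decoding a learned template into a hyperbox inverts this map band by band: the counts of 1-bits in the positive and complement stacks bound each band value from below and above.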
[Fig. 4. Two-dimensional hyperboxes generated by ART-1 with sub-modality limits: vigilance = 0.47, F−1 threshold = 0.40.]

A single multi-spectral image was used in the 10-band image experiment. Each "positive" binary stack had N = 8 bits, yielding a binary input pattern for ART-1 of 20N = 160 bits per pixel. The modified ART-1 network was trained several times and the resulting template color codes were used to form false color images. Several combinations of values for vigilance and F−1 threshold were used, to produce a variety of false color images from which the best could be selected by human visualization. The same process, but without threshold values, was used with the unmodified ART-1 network for comparison. A "best" false color image (having greatest discernible resolution) occurred for the modified network at vigilance values of zero to 0.55 and an F−1 threshold of 0.55. It was based upon 452 templates, essentially produced by F−1-threshold resets. For comparison, the image generated with the unmodified ART-1 network at a vigilance value of 0.55 is shown in Fig. 5 and the image generated with the modified ART-1 network is shown in Fig. 6. The latter is clearly superior in resolution; the former was produced with far fewer templates.

[Fig. 5. False color image generated by unmodified ART-1, vigilance = 0.55.]

[Fig. 6. False color image generated by ART-1 modified with band limits at F−1, vigilance = 0.55, F−1 threshold = 0.55.]

To obtain an even-handed comparison between the modified and unmodified networks, successively higher vigilance values were tried with the unmodified network to approximate the number of templates yielded by the modified network as shown in Fig. 6. A "best" image generated with the unmodified ART-1 network, which produced 399 templates at a vigilance value of 0.795, is shown in Fig. 7. This and higher vigilance values, generating even more templates, produced roughly the same false color image quality. The ART-1 network modified with the thresholded band limit nodes at F−1 yielded a superior image to the unmodified ART-1 network.

[Fig. 7. False color image generated by unmodified ART-1, vigilance = 0.795, same number of template colors as in Fig. 6.]

Finally, notice the color bar legend, labelled with positive integers, at the far right of each image in Figs. 5-7. With the exception of white, each color is associated with a single template, and the number of pixel input patterns associated with it is shown. All templates associated with 10 or fewer pixels are colored white in Fig. 6. To achieve a fair comparison, a higher breaking point for templates associated with fewer pixels was used in Fig. 7; again, all such templates are colored white. This has the effect of producing a color bar equivalent to that for Fig. 6, with both having 32 colors and therefore an equivalent color code. Colors for each figure are assigned to the templates in order of decreasing number of associated pixels, going from blue to red and then white, where the total number of pixels for all templates with fewer pixels (i.e., white templates) is shown.
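The template color assignment used for these legends can be sketched as follows (our reconstruction; the palette indexing and the handling of the cutoff are assumptions).

```python
def assign_template_colors(pixel_counts, n_spectrum=31, white_cutoff=10):
    """Assign false-color codes to templates: sort by decreasing number of
    associated pixels, step through a blue-to-red palette, and color any
    template under the cutoff (or beyond the palette) white. Color values
    are indices into a hypothetical blue-to-red palette."""
    order = sorted(range(len(pixel_counts)), key=lambda k: -pixel_counts[k])
    colors, spectrum = {}, 0
    for k in order:
        if pixel_counts[k] < white_cutoff or spectrum >= n_spectrum:
            colors[k] = 'white'
        else:
            colors[k] = spectrum  # 0 = blue ... n_spectrum-1 = red
            spectrum += 1
    return colors

counts = [500, 230, 48, 12, 9, 3]  # invented pixel counts per template
print(assign_template_colors(counts))
```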
VII. DISCUSSION AND CONCLUSION

The objective of this paper was to illustrate the potential in designing neural networks, or improving upon existing designs, by applying a mathematical semantic model for neural networks based upon category theory. The categorical constructs of the semantic model determine neural representations of knowledge structures involving concepts and their relationships, or morphisms. This puts constraints on architectural design and operational properties. The result of the experiment discussed here illustrates the potential in applying these constraints. Through a relatively simple modification, the vigilance capability of an ART-1 network has been supplemented to provide increased discrimination in clustering. Limits were provided for discrete diagrams in the F1 layer, producing the layer we refer to as F−1. In the context of concept representation, this layer performs an abstraction process by representing subconcepts shared by the concepts represented in the diagrams at F1. In the experiment, the concepts are aspects of multi-spectral image data. We have shown, by visual inspection of the results, that modifying an ART-1 network to include limits can yield increased performance. At the cost of an increase in the number of templates generated, the modified network, applying a threshold value uniformly over the limit-representing nodes at F−1, yields a false color image with superior resolution compared with the resolution achievable with an unmodified ART-1 network. This example shows that the mathematical semantic model can be a useful guide for improving the ART-1 design.

ACKNOWLEDGEMENT

This work was supported in part by Sandia National Laboratories, Albuquerque, New Mexico, under contract no. 238984. Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy's National Nuclear Security Administration under Contract DE-AC04-94AL85000.

REFERENCES

[1] R. Andrews, J. Diederich, and A. B. Tickle. Survey and critique of techniques for extracting rules from trained artificial neural networks. Knowledge-Based Systems, 8(6):373–389, 1995.
[2] G. Bartfai. Hierarchical clustering with ART neural networks. In Proceedings of the IEEE International Conference on Neural Networks, June 28–July 2, 1994, volume II, pages 940–944. IEEE, 1994.
[3] G. A. Carpenter and S. Grossberg. A massively parallel architecture for a self-organizing neural pattern recognition machine. Computer Vision, Graphics, and Image Processing, 37:54–115, 1987.
[4] T. P. Caudell, Y. Xiao, and M. J. Healy. eLoom and Flatland: Specification, simulation and visualization engines for the study of arbitrary hierarchical neural architectures. Neural Networks, 16:617–624, 2003.
[5] M. W. Craven and J. W. Shavlik. Learning symbolic rules using artificial neural networks. In Proceedings of the 10th International Machine Learning Conference, Amherst, MA, pages 73–80, San Mateo, CA, 1993. Morgan Kaufmann.
[6] R. L. Crole. Categories for Types. Cambridge University Press, 1993.
[7] A. C. Ehresmann and J.-P. Vanbremeersch. Information processing and symmetry-breaking in memory evolutive systems. BioSystems, 43:25–40, 1997.
[8] J. A. Goguen and R. M. Burstall. Institutions: Abstract model theory for specification and programming. Journal of the Association for Computing Machinery, 39(1):95–146, 1992.
[9] M. J. Healy. A topological semantics for rule extraction with neural networks. Connection Science, 11(1):91–113, 1999.
[10] M. J. Healy and T. P. Caudell. Acquiring rule sets as a product of learning in a logical neural architecture. IEEE Transactions on Neural Networks, 8(3):461–474, 1997.
[11] M. J. Healy and T. P. Caudell. A categorical semantic analysis of ART architectures. In IJCNN'01: International Joint Conference on Neural Networks, Washington, DC, volume 1, pages 38–43. IEEE/INNS, IEEE Press, 2001.
[12] M. J. Healy and T. P. Caudell. Aphasic compressed representations: A functorial semantic design principle for coupled ART networks. In The 2002 International Joint Conference on Neural Networks (IJCNN'02), Honolulu (CD-ROM Proceedings), page 2656. IEEE/INNS, IEEE Press, 2002.
[13] M. J. Healy and T. P. Caudell. From categorical semantics to neural network design. In Proceedings of the IJCNN 2003 International Joint Conference on Neural Networks, Portland, OR, USA (CD-ROM Proceedings), pages 1981–1986. IEEE/INNS, IEEE Press, 2003.
[14] M. J. Healy and T. P. Caudell. Neural networks, knowledge, and cognition: A mathematical semantic model based upon category theory. Technical Report EECE-TR-04-020, Department of Electrical and Computer Engineering, University of New Mexico, June 2004.
[15] R. Jullig and Y. V. Srinivas. Diagrams for software synthesis. In Proceedings of KBSE '93: The Eighth Knowledge-Based Software Engineering Conference, pages 10–19. IEEE Computer Society Press, 1993.
[16] N. K. Kasabov. Adaptable neuro production systems. Neurocomputing, 13:95–117, 1996.
[17] F. W. Lawvere and S. H. Schanuel. Conceptual Mathematics: A First Introduction to Categories. Cambridge University Press, 1995.
[18] S. Mac Lane. Categories for the Working Mathematician. Springer, 1971. This is the standard reference for mathematicians, written by one of the two co-discoverers of category theory; the other was S. Eilenberg.
[19] J. Meseguer. General logics. In Logic Colloquium '87, pages 275–329. North-Holland, 1987.
[20] B. C. Pierce. Basic Category Theory for Computer Scientists. MIT Press, 1991.
[21] V. Stoltenberg-Hansen, I. Lindström, and E. R. Griffor. Mathematical Theory of Domains. Cambridge University Press, 1994.
[22] S. Vickers. Topology via Logic. Cambridge University Press, 1993.
[23] K. Williamson, M. Healy, and R. Barker. Industrial applications of software synthesis via category theory: Case studies using Specware. Automated Software Engineering, 8(1):7–30, 2001.