From: AAAI Technical Report SS-93-04. Compilation copyright © 1993, AAAI (www.aaai.org). All rights reserved.

Associative Processing: A Paradigm for Massively Parallel AI

James D. Roberts
Computer Engineering Board and Computer and Information Sciences Board
University of California, Santa Cruz, CA 95064
donrob@cse.ucsc.edu

1 Introduction

In associative memory, recall is based on similarity to a cue. With its inherent data parallelism, associative memory naturally lends itself to implementation on massively parallel hardware; it is our thesis that associative processing can serve as the basis of AI systems. We believe that the associative paradigm can encompass both neural network and symbolic applications. Current research indicates that architectural innovations will permit massively parallel processing beyond brute-force search and the parallelized vector-matrix multiplication of neural network models.

We are developing a hardware architecture, the MISC Machine (Misc Is Symbolic Computing), to support associative processing for AI applications. It is a massively parallel hierarchical system combining multiple SIMD (Single Instruction stream, Multiple Data streams) arrays with a MIMD (Multiple Instruction streams, Multiple Data streams) processor network. In contrast to PROLOG and LISP engines, MISC is based on application rather than programming-language requirements, and emphasizes association as the basis for intelligent systems.

Our three target application areas are neural networks, marker-passing semantic networks, and conceptual graphs. Our approach analyzes these applications for fundamental high-level associative operations. These high-level operations are then translated into machine primitives, emphasizing memory content-addressing via the SIMD arrays. Software development will be based on object- and collection-oriented languages such as SETL [Sch92, SDDS86]. By working directly from AI application requirements, the system will be tailored to symbolic codes, with simplified programming and a significant execution-time speedup.

2 Associative Processing

Associative memory encompasses a wide variety of phenomena related to human memory performance [HA86, HMR86]. Kohonen suggests that it is closely related to the semantic representation of knowledge in relational structures [Koh80]. He describes two types of associative memory. First, direct association, the most common usage of the term, refers to the recall of one pattern by the input of a cue pattern. Direct association provides a single input-to-output mapping based on similarity of content, physical, temporal, or logical relations, and typically deals with ordered sets of attributes. Second, indirect association involves inference via multiple intermediate associative mappings. In indirect association, structural relationships are an important part of the patterns [Koh80]. In addition to mappings and inference, pattern completion from a partial or erroneous input is an important feature of associative memory. More importantly, most associative memory models include learning the cue-to-recall mappings.
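A minimal sketch of direct association, assuming toy 8-bit patterns and NumPy in place of parallel hardware, makes cue-based recall and pattern completion concrete:

```python
import numpy as np

def recall(memory_keys, memory_values, cue, max_distance=2):
    """Direct association: return the stored value whose key is closest
    to the cue in Hamming distance, provided it lies within
    max_distance (otherwise recall fails)."""
    distances = np.sum(memory_keys != cue, axis=1)  # Hamming distance to every key
    best = int(np.argmin(distances))
    if distances[best] <= max_distance:
        return memory_values[best]
    return None  # no sufficiently similar pattern is stored

# Hypothetical 8-bit patterns: a noisy cue still recalls the right value.
keys = np.array([[1, 0, 1, 1, 0, 0, 1, 0],
                 [0, 1, 0, 0, 1, 1, 0, 1]])
values = ["pattern-A", "pattern-B"]
cue = np.array([1, 0, 1, 1, 0, 1, 1, 0])  # one bit corrupted
print(recall(keys, values, cue))          # -> "pattern-A"
```

Recall degrades gracefully: a corrupted cue still retrieves the nearest stored pattern so long as it falls within the distance bound.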
Content-addressable memory (CAM) is a physical embodiment of basic associative memory in which data is accessed by its content rather than by an address as in conventional computer memory. Associative processing is an extension of CAM, permitting more sophisticated access as well as manipulation of the selected data, providing the operations necessary to implement associative memory. Our associative processing model groups these lower-level operations into four categories:

• Matching operations include equivalence and magnitude comparisons of data primitives such as Boolean attribute sets, integers, and character strings, and are the basic operation of searching.

• Distance metric calculations support the generalization and noise removal characteristic of associative memory. With continuity as the inductive bias, an associative memory can be implemented by searching for all storage elements within some distance of the cue in the parameter space and then processing the values found.

• Partitioning operations divide memory into equivalence classes based on prior matching and distances. In addition to selecting active data elements for further processing, partitioning can be used to extend the basic match operations into a variety of more sophisticated searches such as k-nearest neighbor.

• General operations can then be applied to one or more partitions of the data set. These can include readout of search responses, summation and thresholding, and set operations.

All such operations can be executed on all or selected data structures simultaneously, permitting massive parallelism. Associative processing can also support recognition, classification, clustering, and dimension reduction. These functions can in turn serve as high-level operations in knowledge processing. The versatile retrieval offered by associative processing can support experience- (case-) based reasoning, heuristic search, and other knowledge retrieval. Spanning connectionist and symbolic processing, our target paradigms of neural networks, semantic networks, and conceptual graphs are intended to support the development of applications in diverse areas including natural language recognition and translation, machine vision, robotics and control systems, production systems, and information systems engineering, as well as the wide variety of neural network applications [PF89, Jae89, EL92, M+92].

3 Related Work

The general literature shows that systems with CAM as an architectural element can offer significant speed and coding benefits for AI programs. Operations which can particularly benefit include: backtracking, clause filtering, conflict resolution, demand-driven reduction, garbage collection, indirection, inference, matching, the RETE algorithm, search (A*), unification, and variable binding [Kog91, K+89, O+89, M+90, RG91, NGC89]. Applications which especially benefit from CAM include natural language, neural networks, pattern and symbol recognition, and semantic nets [M+90, WLL90, Kog91]. Several associative processors have already been developed by others, such as LUCAS and AAP2 [FKS85, K+86]. These, however, provide only the basic content-addressing functions typical of database machines. Parallel processors incorporating CAM include SNAP and IXM2, but these are restricted to semantic network processing [M+92, H+91]. Our research is developing a general-purpose parallel processor for AI, fully supporting associative processing on complex data objects.
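To ground these benefits, the sketch below emulates a ternary ("don't care") CAM probe in NumPy: one masked, data-parallel comparison selects every stored word matching a pattern, the primitive to which operations such as clause filtering and variable binding reduce. The 8-bit tags and their encoding are hypothetical, and the vectorized comparison merely stands in for what CAM hardware performs in a single cycle across all words.

```python
import numpy as np

def cam_match(store, pattern, care_mask):
    """Emulate a CAM probe: compare all stored words against a search
    pattern in one data-parallel step, ignoring bit positions where
    care_mask is 0 (a ternary match)."""
    relevant = (store ^ pattern) & care_mask  # differing bits in cared-about positions
    return np.where(relevant == 0)[0]         # indices of all responders at once

# Hypothetical 8-bit tags, e.g. encoding (predicate, arity) pairs.
store = np.array([0b10110010, 0b10110110, 0b01100110], dtype=np.uint8)
pattern, mask = np.uint8(0b10110000), np.uint8(0b11110000)  # match high nibble only
print(cam_match(store, pattern, mask))  # -> [0 1]: both responders found in parallel
```

Clause filtering, for instance, reduces to probes of this form: all candidate clauses whose heads match a goal pattern respond simultaneously, with no sequential scan.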
4 Application Areas

One neural network model, Kanerva's Sparse Distributed Memory (SDM) [Kan88], is particularly illustrative of the associative processing operations. In the basic model, SDM is a three-layer associative memory in which the weights between the input and hidden layer are fixed, random, and uniformly distributed. Hebbian learning is applied between the hidden and output layers. Input and output vectors are Boolean, and weights are integers (in recent variations, integer inputs are accommodated). Kanerva has shown that for long input vectors, the properties of a sparsely populated high-dimensional Boolean space give rise to generalization and other associative memory properties [Kan88]. An m-bit input vector induces a parameter space of 2^m points. With an input vector hundreds or thousands of bits long, the parameter space can be only sparsely populated by storage locations. For access, a matching operation is performed and the Hamming distance calculated between the input vector and all the populated storage locations. All locations within a given distance of the input vector are then made part of an activation set. For writing (learning), storage of the desired output is distributed among the members of the activation set, each member incrementing and decrementing counters in accordance with the desired output vector. On recall, the counters of the active set are summed for each bit position and the result thresholded to produce the output. SDM thereby functions as a memory that can learn associative mappings, generalize and interpolate, remove noise, and both generate and recognize sequences [Kan88]. (These read and write operations are sketched in code at the end of this section.)

For semantic networks, nodes are activated by content addressing, performing matching fully in parallel, followed by marker passing and other operations [Fah79, M+92].

We are also investigating massively parallel associative implementation of conceptual graph knowledge processing [Sow91]. This research includes coordination with the PEIRCE international collaboration, which is developing a conceptual graphs software workbench [EL92]. This knowledge representation is particularly demanding in its need for match and search involving structured (relational), often recursively defined patterns. Key operations include graph isomorphism and subsumption testing and partial-order maintenance. Rapid retrieval of relations between entities is particularly important for analogical problem solving [CG92]. Associative processing can be especially important for large knowledge bases such as those derived from problem-solving experience.
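Returning to SDM, the read and write operations described at the start of this section condense into a short sketch. The dimensions, activation radius, and NumPy representation below are illustrative choices, not Kanerva's parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n_locations, radius = 64, 2000, 24  # toy sizes; Kanerva uses m in the hundreds

addresses = rng.integers(0, 2, size=(n_locations, m))  # fixed random hard locations
counters  = np.zeros((n_locations, m), dtype=int)      # one counter per location per bit

def activation(cue):
    """All hard locations within Hamming radius of the cue."""
    return np.sum(addresses != cue, axis=1) <= radius

def write(cue, word):
    """Distribute the word over the active set: each active location
    increments counters where word is 1 and decrements where it is 0."""
    counters[activation(cue)] += 2 * word - 1  # maps {0,1} -> {-1,+1}

def read(cue):
    """Sum the active locations' counters per bit position and threshold."""
    return (counters[activation(cue)].sum(axis=0) > 0).astype(int)

word = rng.integers(0, 2, size=m)
write(word, word)                    # autoassociative store
noisy = word.copy(); noisy[:5] ^= 1  # corrupt five bits of the cue
print(np.sum(read(noisy) != word))   # usually 0: noise is removed on recall,
                                     # assuming the activation sets overlap
```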
5 The MISC Machine

The associative processing operations required by the target applications form the basis of the MISC Machine design. Massive parallelism is necessary for the extreme speedups required to scale these models into large applications, particularly those involving multiple "mind agents" as proposed by Minsky [Min86]. The associative operations and the unique characteristics of AI data structures and codes led to an overall MISC Machine architecture that is a hierarchical system composed of multiple SIMD arrays, each supervised by a microprocessor forming part of a MIMD network. Each SIMD array serves as a separate associative processor, operating either independently or in combination with other arrays. Each supervisor controls its associative array, executes scalar operations and control structures too complex for the SIMD array, and mediates communications.

A full-scale MISC Machine might include millions of SIMD processing elements and thousands of supervisor processors, requiring a highly scalable communications network. We are now refining this global architecture based on details of the target applications. As suggested by the prior discussion of SDM, we are investigating associative memory and neural network models at a higher level of abstraction than vector-matrix multiplication connectionist models. Similarly, MISC will support symbolic approaches with more than brute-force search, the MIMD-SIMD combination permitting sophisticated parallel algorithms and heuristics. In this way, an entire application, rather than just certain operations, will achieve significant speedup.

Significant extension of the SIMD architecture will be necessary to deal with the dynamic, variable-length, and linked data structures typical of AI codes. This is particularly important in extending CAM matching from ordered lists of attributes to structured relations and other complex pattern matching operations. Enhancements include:

• Local addressing, to permit greater flexibility in data structure access.

• Alternative instruction streams, to improve processor utilization over simply masking out different groups of processors for conditional operations based on local data (a toy emulation at the end of this section illustrates the utilization cost of pure masking).

• Combination of bit-serial and byte- or word-wide operations, significantly accelerating operations such as distance calculation.

These features are part of recent trends in SIMD architecture research, and have been included to some degree in processors such as BLITZEN, Non-Von, and MasPar [B+90, Sha88, Nic90]. In addition to taking such evolution further, the MISC Machine will optimize features based on the unique high-level associative operations beneficial to AI codes in the target areas.

Although the proposed SIMD array architectural enhancements are intended to mitigate the limitations of SIMD, hybridization with MIMD processing will still be necessary to cope with the complex, data-driven control structures of AI and to permit the MISC Machine to be a full system rather than merely a knowledge-base back-end. Considering the recursion- and subroutine-intensive nature of AI codes, as well as models such as the Warren Abstract Machine, a stack-oriented processor architecture may be desirable.

With the goal of a thousand or more supervisors, it will be crucial to avoid bottlenecks in the global communication network. The complex, data-dependent communication patterns of AI applications will make this more difficult than for scientific computation. One beneficial aspect of large-scale AI processing is that of progressive abstraction. For example, in advanced vision systems massive input is progressively reduced to a more compact and abstract representation. In such cases, the initial low-level communication is high bandwidth, but in highly regular patterns. Upper levels exhibit complex nondeterministic patterns, but data rates are much lower. This global view of communication also contributed to the overall architecture combining massively parallel SIMD at the lowest levels and MIMD for the upper levels. Taking advantage of low-level regularity, communication locality, and data abstraction effects with hierarchical network topologies will minimize communication bottlenecks and make the MISC Machine highly scalable.
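The utilization argument for alternative instruction streams can be seen in a toy emulation of masked SIMD execution, with NumPy arrays standing in for an eight-element processor array; the data and branch bodies are hypothetical:

```python
import numpy as np

# Toy emulation of SIMD conditional execution over 8 processing elements.
data = np.array([3, -1, 4, -1, 5, -9, 2, -6])
mask = data >= 0  # a local condition partitions the PEs into two groups

# Classic SIMD masking: both branches are broadcast serially, and in each
# step the PEs excluded by the mask sit idle.
out = np.empty_like(data)
out[mask]  = data[mask] * 2   # step 1: roughly half the PEs idle
out[~mask] = -data[~mask]     # step 2: the other half idle
# Utilization over the two steps is only about 50%; with alternative
# instruction streams, each group could execute its branch concurrently.
print(out)  # [ 6  1  8  1 10  9  4  6]
```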
6 Conclusion

The MISC Machine combines SIMD and MIMD processing in an architecture designed to scale to millions of associative processing elements and thousands of supervisor processors. Its development is based on the requirements of applications rather than programming languages. To date we have completed a block-level overall design and a more detailed design of the SIMD processing elements, and are currently performing simulations to improve the design in preparation for a system prototype. We have also developed an associative programming model and have shown that the target applications can be readily expressed in the model and efficiently executed on the MISC architecture.

We hypothesize that the associative processing paradigm, with data accessed and manipulated uniformly by content and equivalence class, will enable massively parallel execution and simplify programming for a broad range of AI research and applications. Both connectionist and symbolic paradigms can be supported, and architectural advancements are intended to remove the restriction of massively parallel AI operations to numerical connectionist models and brute-force knowledge bases. The MISC Machine should be especially good for applications using neural network, semantic network, conceptual graph, and expert system paradigms, particularly those requiring large knowledge bases of complex objects. The power of a full associative processing system may greatly extend research capabilities and the tractable problem size of applications. We hope that in the broadest sense this will enable previously impossible exploration of large-scale intelligent systems.

Acknowledgements

This research was supported in part by seed funds from the Division of Natural Sciences, University of California, Santa Cruz. The author thanks Robert Levinson and Richard Hughey for their assistance in preparing this paper.

References

[B+90] D. W. Blevins et al. BLITZEN: A highly integrated massively parallel machine. Journal of Parallel and Distributed Computing, 8:150-160, February 1990.

[CG92] Darrell Conklin and Janice Glasgow. Spatial analogy and subsumption. In Derek Sleeman and Peter Edwards, editors, Machine Learning: Proceedings of the Ninth International Workshop (ML92), San Mateo, CA, 1992. Morgan Kaufmann Publishers.

[EL92] Gerard Ellis and Robert A. Levinson, editors. Proceedings of the First International Workshop on PEIRCE: A Conceptual Graphs Workbench. 1992. Held in association with the Seventh Annual Workshop on Conceptual Graphs, Las Cruces, New Mexico.

[Fah79] Scott E. Fahlman. NETL: A System for Representing and Using Real-World Knowledge. MIT Press, Cambridge, Mass., 1979.

[FKS85] Christer Fernstrom, Ivan Kruzela, and Bertil Svensson. LUCAS Associative Array Processor: Design, Programming, and Application Studies. Springer-Verlag, New York, 1985.

[H+91] Tetsuya Higuchi et al. IXM2: a parallel associative processor. ACM SIGARCH, 19(3):22-33, September 1991.

[HA86] Geoffrey E. Hinton and James A. Anderson, editors. Parallel Models of Associative Memory. MIT Press, Cambridge, Mass., 1986.

[HMR86] Geoffrey E. Hinton, James L. McClelland, and David E. Rumelhart. Distributed representations. In Parallel Distributed Processing. MIT Press, Cambridge, Mass., 1986.

[Jae89] L. A. Jaeckel. An alternative design for a sparse distributed memory. Technical report, Research Institute for Advanced Computer Science, NASA Ames Research Center, 1989.

[K+86] Toshiro Kondo et al. Pseudo MIMD array processor -- AAP2. In 13th Conf. on Computer Architecture, pages 330-377, 1986.

[K+89] Peter Kogge et al. VLSI and rule-based systems. In Jose G. Delgado-Frias and Will R. Moore, editors, VLSI for Artificial Intelligence. Kluwer Academic Publishers, Boston, Mass., 1989.

[Kan88] Pentti Kanerva. Sparse Distributed Memory. MIT Press, Cambridge, Mass., 1988.

[Kog91] Peter M. Kogge. The Architecture of Symbolic Computers. McGraw-Hill, New York, 1991.

[Koh80] Teuvo Kohonen. Content-Addressable Memories. Springer-Verlag, New York, 1980.

[M+90] Masato Motomura et al. A 1.2 million transistor, 33-MHz, 20-b dictionary search processor (DISP) ULSI with a 160-kb CAM. Journal of Solid-State Circuits, 25(5):1158-1165, October 1990.

[M+92] Dan Moldovan et al. SNAP: Parallel processing applied to AI. Computer, 25(5):39-49, May 1992.

[Min86] Marvin Minsky. The Society of Mind. Simon and Schuster, New York, 1986.

[NGC89] Yan Ng, Raymond Glover, and Chew-Lye Chng. Unify with active memory. In Jose G. Delgado-Frias and Will R. Moore, editors, VLSI for Artificial Intelligence. Kluwer Academic Publishers, Boston, Mass., 1989.

[Nic90] John R. Nickolls. The design of the MasPar MP-1: A cost-effective massively parallel computer. In Proceedings of COMPCON, pages 25-28, February 1990.

[O+89] Takeshi Ogura et al. A 20-kbit associative memory LSI for artificial intelligence machines. Journal of Solid-State Circuits, 24(4):1014-1020, August 1989.

[PF89] R. W. Prager and F. Fallside. The modified Kanerva model for automatic speech recognition. Computer Speech and Language, 3(1):61-81, 1989.

[RG91] D. P. Rodohan and R. J. Glover. An overview of the A* architecture for optimization problems in a logic programming environment. ACM Computer Architecture News, 19(4):124-131, June 1991.

[Sch92] Mario Schultz. An object-oriented interface in C++ to an associative processor. In CompEuro Proceedings: Computer Systems and Software Engineering, Washington, D.C., 1992. IEEE Computer Society Press.

[SDDS86] J. T. Schwartz, R. B. K. Dewar, E. Dubinsky, and E. Schonberg. Programming with Sets: An Introduction to SETL. Springer-Verlag, New York, 1986.

[Sha88] David Elliot Shaw. Organization and operation of massively parallel machines. In Guy Rabbat, editor, Handbook of Advanced Semiconductor Technology and Computer Systems. Van Nostrand Reinhold, New York, 1988.

[Sow91] John F. Sowa, editor. Principles of Semantic Networks. Morgan Kaufmann Publishers, San Mateo, CA, 1991.

[WLL90] Benjamin W. Wah, Matthew B. Lowrie, and Guo-Jie Li. Computers for symbolic processing. In Benjamin W. Wah and C. V. Ramamoorthy, editors, Computers for Artificial Intelligence Processing, pages 1-73. John Wiley & Sons, New York, 1990.