From: AAAI Technical Report SS-93-04. Compilation copyright © 1993, AAAI (www.aaai.org). All rights reserved.
Associative Processing: A Paradigm for Massively Parallel AI

James D. Roberts
Computer Engineering Board and Computer and Information Sciences Board
University of California, Santa Cruz, CA 95064
donrob@cse.ucsc.edu
1 Introduction
In associative memory, recall is based on similarity to a cue. With its inherent data parallelism, associative memory naturally lends itself to implementation on massively parallel hardware; it is our thesis that associative processing can serve as the basis of AI systems. We believe that the associative paradigm can encompass both neural network and symbolic applications. Current research indicates that architectural innovations will permit massively parallel processing beyond brute force search and parallelizing vector-matrix multiplication neural network models.
We are developing a hardware architecture, the MISC Machine (Misc Is Symbolic Computing), to support associative processing for AI applications. It is a massively parallel hierarchical system combining multiple SIMD (Single Instruction stream, Multiple Data streams) arrays with a MIMD (Multiple Instruction streams, Multiple Data streams) processor network. In contrast to PROLOG and LISP engines, MISC is based on application rather than programming language requirements, and emphasizes association as the basis for intelligent systems. Our three target application areas are neural networks, marker-passing semantic networks, and conceptual graphs. Our approach analyzes these applications for fundamental high-level associative operations. These high-level operations are then translated into machine primitives, emphasizing memory content-addressing via the SIMD arrays. Software development will be based on object- and collection-oriented languages such as SETL [Sch92, SDDS86]. By working directly from AI application requirements, the system will be tailored to symbolic codes, with simplified programming and a significant execution time speedup.
2 Associative Processing
Associative memory encompasses a wide variety of phenomena related to human memory performance [HA86, HMR86]. Kohonen suggests that it is closely related to the semantic representation of knowledge in relational structures [Koh80]. He describes two types of associative memory. First, direct association, the most common usage of the term, refers to the recall of one pattern by the input of a cue pattern. Direct association provides a single input-to-output mapping based on similarity of content, physical, temporal, or logical relations, and typically deals with ordered sets of attributes. Second, indirect association involves inference via multiple intermediate associative mappings. In indirect association, structural relationships are an important part of the patterns [Koh80]. In addition to mappings and inference, pattern completion from a partial or erroneous input is an important feature of associative memory. More importantly, most associative memory models include learning the cue-to-recall mappings.
Content-addressable memory (CAM) is a physical embodiment of basic associative memory in which data is accessed by its content rather than by an address as in conventional computer memory. Associative processing is an extension of CAM, permitting more sophisticated access as well as manipulation of the selected data, providing the operations necessary to implement associative memory. Our associative processing model groups these lower-level operations into four categories:
• Matching operations include equivalence and magnitude comparisons of data primitives such as Boolean attribute sets, integers, and character strings, and are the basic operation of searching.

• Distance metric calculations support the generalization and noise removal characteristic of associative memory. With continuity as the inductive bias, an associative memory can be implemented by searching for all storage elements within some distance of the cue in the parameter space and then processing the values found.

• Partitioning operations divide memory into equivalence classes based on prior matching and distances. In addition to selecting active data elements for prior processing, partitioning can be used to extend the basic match operations into a variety of more sophisticated searches such as k-nearest neighbor.

• General operations can then be applied to one or more partitions of the data set. These can include readout of search responses, summation and thresholding, and set operations.

All such operations can be executed on all or selected data structures simultaneously, permitting massive parallelism.
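
As a concrete illustration, the following sequential Python sketch walks through the four categories over a table of Boolean attribute vectors. All names, sizes, the radius, and k are our own illustrative choices, not MISC parameters; on the MISC Machine each stored vector would reside in its own SIMD processing element, so each step would execute across all elements simultaneously.

```python
# A sequential sketch of the four associative operation categories over a
# table of Boolean attribute vectors. Sizes and thresholds are illustrative.
import numpy as np

rng = np.random.default_rng(0)
memory = rng.integers(0, 2, size=(1024, 64), dtype=np.uint8)  # stored vectors
cue = rng.integers(0, 2, size=64, dtype=np.uint8)             # search cue

# 1. Matching: equivalence comparison of every stored vector against the cue.
exact_hits = np.all(memory == cue, axis=1)

# 2. Distance metric: Hamming distance from the cue to every stored vector.
dist = np.sum(memory != cue, axis=1)

# 3. Partitioning: divide memory into equivalence classes by the results of
#    matching and distance; here, a radius class and a k-nearest class.
within_radius = dist <= 24
k_nearest = np.argsort(dist)[:8]

# 4. General operations on a partition: readout, summation, thresholding.
votes = memory[k_nearest].sum(axis=0)                      # column-wise sums
recalled = (votes >= len(k_nearest) / 2).astype(np.uint8)  # majority threshold

print(int(exact_hits.sum()), int(within_radius.sum()), recalled)
```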
Associative processing can also support recognition, classification, clustering, and dimension reduction. These functions can in turn serve as high-level operations in knowledge processing. The versatile retrieval offered by associative processing can support experience- (case-) based reasoning, heuristic search, and other knowledge retrieval. Covering connectionist and symbolic processing, it is intended that our target paradigms of neural networks, semantic networks, and conceptual graphs will support the development of applications in diverse areas including natural language recognition and translation, machine vision, robotics and control systems, production systems, and information systems engineering, as well as the wide variety of neural network applications [PF89, Jae89, EL92, M+92].
3 Related Work
The general literature shows that systems with CAM as an architectural element can offer significant speed and coding benefits for AI programs. Operations which can particularly benefit include: backtracking, clause filtering, conflict resolution, demand-driven reduction, garbage collection, indirection, inference, matching, the RETE algorithm, search (A*), unification, and variable binding [Kog91, K+89, O+89, M+90, RG91, NGC89]. Applications which especially benefit from CAM include natural language, neural networks, pattern and symbol recognition, and semantic nets [M+90, WLL90, Kog91].
Several associative processors have already been developed by others, such as LUCAS and AAP2 [FKS85, K+86]. These, however, provide only basic content-addressing functions typical of database machines. Parallel processors incorporating CAM include SNAP and IXM2, but these are restricted to semantic network processing [M+92, H+91]. Our research is developing a general purpose parallel processor for AI, fully supporting associative processing on complex data objects.
4 Application Areas
One neural network model, Kanerva's Sparse Distributed Memory (SDM) [Kan88], is particularly illustrative of the associative processing operations. In the basic model, SDM is a three-layer associative memory in which the weights between the input and hidden layer are fixed, random, and uniformly distributed. Hebbian learning is applied between the hidden and output layers. Input and output vectors are Boolean, and weights are integers (recent variations also accommodate integer inputs). Kanerva has shown that for long input vectors, the properties of a sparsely populated high-dimensional Boolean space give rise to generalization and other associative memory properties [Kan88]. An m-bit input vector induces a parameter space of 2^m points. With an input vector hundreds or thousands of bits long, the parameter space can be only sparsely populated by storage locations. For access, a matching operation is performed and Hamming distance calculated between the input vector and all the populated storage locations. All locations within a given distance of the input vector are then made part of an activation set. For writing (learning), storage of the desired output is distributed among the members of the activation set, each member incrementing and decrementing counters in accordance with the desired output vector. On recall, the counters of the active set are summed for each bit position and the result thresholded to produce the output. SDM thereby functions as a memory that can learn associative mappings, generalize and interpolate, remove noise, and both generate and recognize sequences [Kan88].
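
A compact sketch of this access cycle follows, again as sequential Python. The dimensions and activation radius are illustrative choices of our own rather than Kanerva's designs, and in a parallel implementation each hard location would occupy one processing element.

```python
# A sketch of the basic SDM cycle: activate locations within a Hamming radius
# of the cue, write by incrementing/decrementing counters, read by summing
# and thresholding. Sizes and radius are illustrative only.
import numpy as np

rng = np.random.default_rng(1)
m, n_locations, radius = 256, 2000, 112  # bits, hard locations, radius

addresses = rng.integers(0, 2, size=(n_locations, m), dtype=np.uint8)  # fixed random layer
counters = np.zeros((n_locations, m), dtype=np.int32)                  # learned weights

def activation_set(cue):
    """All storage locations within the Hamming radius of the cue."""
    return np.sum(addresses != cue, axis=1) <= radius

def write(cue, target):
    """Each active member increments counters where the target bit is 1
    and decrements where it is 0."""
    active = activation_set(cue)
    counters[active] += np.where(target == 1, 1, -1).astype(np.int32)

def read(cue):
    """Sum the active counters per bit position and threshold the result."""
    active = activation_set(cue)
    return (counters[active].sum(axis=0) > 0).astype(np.uint8)

pattern = rng.integers(0, 2, size=m, dtype=np.uint8)
write(pattern, pattern)                     # autoassociative storage
noisy = pattern ^ (rng.random(m) < 0.05)    # corrupt ~5% of the bits
print(int(np.sum(read(noisy) != pattern)))  # residual errors after recall
```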
For semantic networks, nodes are activated by content addressing, performing matching fully in parallel, followed by marker passing and other operations [Fah79, M+92]. We are also investigating massively parallel associative implementation of conceptual graph knowledge processing [Sow91]. This research includes coordination with the PEIRCE international collaboration which is developing a conceptual graphs software workbench [EL92]. This knowledge representation is particularly demanding in its need for match and search involving structured (relational), often recursively defined patterns. Key operations include graph isomorphism and subsumption testing and partial order maintenance. Rapid retrieval of relations between entities is particularly important for analogical problem solving [CG92]. Associative processing can be especially important for large knowledge bases such as those derived from problem-solving experience.
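
For illustration, a toy sequential sketch of content-addressed activation followed by marker passing, in the style of NETL [Fah79]; the network, attributes, and query are invented, and on MISC both the node test and each propagation step would execute in parallel across the SIMD arrays.

```python
# Toy marker passing over a small semantic network. Nodes carry attribute
# sets and edges are labeled relations; the content-addressed activation and
# each marker-passing step are each one parallel operation on real hardware.
nodes = {
    "Clyde":    {"kind": "individual", "species": "elephant"},
    "elephant": {"kind": "concept"},
    "mammal":   {"kind": "concept"},
    "gray":     {"kind": "property"},
}
edges = [
    ("Clyde", "isa", "elephant"),
    ("elephant", "isa", "mammal"),
    ("elephant", "color", "gray"),
]

def activate(test):
    """Content addressing: mark every node whose attributes satisfy the test."""
    return {name for name, attrs in nodes.items() if test(attrs)}

def pass_markers(marked, relation):
    """Propagate markers one step along edges with the given label."""
    return {dst for src, rel, dst in edges if rel == relation and src in marked}

# Query: everything Clyde inherits from, transitively, via "isa".
marked = activate(lambda a: a.get("species") == "elephant")  # marks {"Clyde"}
frontier = marked
while frontier:                       # repeated passes give the closure
    frontier = pass_markers(frontier, "isa") - marked
    marked |= frontier
print(marked)                         # {'Clyde', 'elephant', 'mammal'}
```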
5 The MISC Machine
The associative processing operations required by the target applications form the basis of the MISC Machine design. Massive parallelism is necessary for the extreme speedups required to scale these models into large applications, particularly those involving multiple "mind agents" as proposed by Minsky [Min86]. The associative operations and unique characteristics of AI data structures and codes contributed to the overall MISC Machine architecture being a hierarchical system composed of multiple SIMD arrays, each supervised by a microprocessor forming part of a MIMD network. Each SIMD array serves as a separate associative processor, operating either independently or in combination with other arrays. Each supervisor controls its associative array, executes scalar operations and control structures too complex for the SIMD array, and mediates communications. A full-scale MISC Machine might include millions of SIMD processing elements and thousands of supervisor processors, requiring a highly scalable communications network.
We are now refining this global architecture based on details of the target applications. As suggested by the prior discussion of SDM, we are investigating associative memory and neural network models at a higher level of abstraction than vector-matrix multiplication connectionist models. Similarly, MISC will support symbolic approaches with more than brute force search, the MIMD-SIMD combination permitting sophisticated parallel algorithms and heuristics. In this way, an entire application, rather than just certain operations, will achieve significant speedup.
Significant extension of the SIMD architecture will be necessary to deal with the dynamic, variable length, and linked data structures typical of AI codes. This is particularly important in extending CAM matching from ordered lists of attributes to structured relations and other complex pattern matching operations. Enhancements include:
• Local addressing to permit greater flexibility in data structure access.

• Alternative instruction streams to improve processor utilization over simply masking out different groups of processors for conditional operations based on local data.

• Combination of bit-serial and byte- or word-wide operations, significantly accelerating operations such as distance calculation.
These features are part of recent trends in SIMD architecture research, and have been included to some degree in processors such as Blitzen, Non-Von, and MasPar [B+90, Sha88, Nic90]. In addition to taking such evolution further, the MISC Machine will optimize features based on the unique high-level associative operations beneficial to AI codes in the target areas.
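
To make the bit-serial versus word-wide contrast concrete, the sketch below compares a bit-serial Hamming distance, one bit position per step, with a word-wide version that XORs w bits at a time and counts them, cutting the step count by roughly a factor of w. The functions and word size are our own illustration, not MISC primitives.

```python
# Bit-serial vs. word-wide Hamming distance between two m-bit values packed
# into Python integers. The word-wide version models a PE with a w-bit ALU
# and a population-count operation; both functions are illustrative only.
def hamming_bit_serial(a: int, b: int, m: int) -> int:
    """m steps: compare one bit position per iteration."""
    return sum(((a >> i) ^ (b >> i)) & 1 for i in range(m))

def hamming_word_wide(a: int, b: int, m: int, w: int = 32) -> int:
    """ceil(m/w) steps: XOR a whole word, then popcount it."""
    total, mask = 0, (1 << w) - 1
    for i in range(0, m, w):
        total += bin(((a >> i) ^ (b >> i)) & mask).count("1")
    return total

a, b = 0b1011_0110, 0b0011_1100
assert hamming_bit_serial(a, b, 8) == hamming_word_wide(a, b, 8) == 3
```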
Although the proposed SIMD array architectural enhancements are intended to mitigate the limitations of SIMD, hybridization with MIMD processing will still be necessary to cope with the complex, data driven control structures of AI and permit the MISC Machine to be a full system rather than merely a knowledge base back-end. Considering the recursion and subroutine intensive nature of AI codes, as well as models such as the Warren Abstract Machine, a stack oriented processor architecture may be desirable.
With the goal of a thousand or more supervisors, it will be crucial to avoid bottlenecks in the global communication network. The complex, data-dependent communication patterns of AI applications will make this more difficult than for scientific computation. One beneficial aspect of large-scale AI processing is that of progressive abstraction. For example, in advanced vision systems massive input is progressively reduced to a more compact and abstract representation. In such cases, the initial low-level communication is high bandwidth, but in highly regular patterns. Upper levels exhibit complex nondeterministic patterns, but data rates are much lower. This global view of communication also contributed to the overall architecture combining massively parallel SIMD at the lowest levels and MIMD for the upper levels. Taking advantage of low-level regularity, communication locality, and data abstraction effects with hierarchical network topologies will minimize communication bottlenecks and make the MISC Machine highly scalable.
6 Conclusion
The MISC Machine combines SIMD and MIMD processing in an architecture designed to scale to millions of associative processing elements and thousands of supervisor processors. Its development is based on the requirements of applications rather than programming languages. To date we have completed a block-level overall design and a more detailed design of the SIMD processing elements, and are currently performing simulations to improve the design in preparation for a system prototype. We have also developed an associative programming model and have shown that the target applications can be readily expressed in the model and efficiently executed on the MISC architecture.

We hypothesize that the associative processing paradigm, with data accessed and manipulated uniformly by content and equivalence class, will enable massively parallel execution and simplify programming for a broad range of AI research and applications. Both connectionist and symbolic paradigms can be supported, and architectural advancements are intended to remove the restriction of massively parallel AI operations to numerical connectionist models and brute-force knowledge bases. The MISC Machine should be especially good for applications using neural network, semantic network, conceptual graph, and expert system paradigms, particularly those requiring large knowledge bases of complex objects. The power of a full associative processing system may greatly extend research capabilities and the tractable problem size of applications. We hope that in the broadest sense this will enable previously impossible exploration of large-scale intelligent systems.
Acknowledgements
This research was supported in part by seed funds from the Division of Natural Sciences, University of California, Santa Cruz. The author thanks Robert Levinson and Richard Hughey for their assistance in preparing this paper.
References
[B+90] D. W. Blevins et al. BLITZEN: A highly integrated massively parallel machine. J. Parallel and Distributed Computing, 8:150-160, February 1990.

[CG92] Darrell Conklin and Janice Glasgow. Spatial analogy and subsumption. In Derek Sleeman and Peter Edwards, editors, Machine Learning: Proceedings of the Ninth International Workshop (ML92), San Mateo, CA, 1992. Morgan Kaufmann Publishers.

[EL92] Gerard Ellis and Robert A. Levinson, editors. Proceedings of the First International Workshop on PEIRCE: A Conceptual Graphs Workbench, 1992. Held in association with the Seventh Annual Workshop on Conceptual Graphs, Las Cruces, New Mexico.

[Fah79] Scott E. Fahlman. NETL: A System for Representing and Using Real-World Knowledge. MIT Press, Cambridge, Mass., 1979.

[FKS85] Christer Fernstrom, Ivan Kruzela, and Bertil Svensson. LUCAS Associative Array Processor: Design, Programming, and Application Studies. Springer-Verlag, New York, 1985.

[H+91] Tetsuya Higuchi et al. IXM2: a parallel associative processor. ACM SIGARCH, 19(3):22-33, September 1991.

[HA86] Geoffrey E. Hinton and James A. Anderson, editors. Parallel Models of Associative Memory. MIT Press, Cambridge, Mass., 1986.

[HMR86] Geoffrey E. Hinton, James L. McClelland, and David E. Rumelhart. Distributed representations. In Parallel Distributed Processing. MIT Press, Cambridge, Mass., 1986.

[Jae89] L. A. Jaeckel. An alternative design for a sparse distributed memory. Technical report, Research Institute for Advanced Computer Science, NASA Ames Research Center, 1989.

[K+86] Toshiro Kondo et al. Pseudo MIMD array processor -- AAP2. In 13th Conf. on Computer Architecture, pages 330-337, 1986.

[K+89] Peter Kogge et al. VLSI and rule-based systems. In Jose G. Delgado-Frias and Will R. Moore, editors, VLSI for Artificial Intelligence. Kluwer Academic Publishers, Boston, Mass., 1989.

[Kan88] Pentti Kanerva. Sparse Distributed Memory. MIT Press, Cambridge, Mass., 1988.

[Kog91] Peter M. Kogge. The Architecture of Symbolic Computers. McGraw-Hill, New York, 1991.

[Koh80] Teuvo Kohonen. Content-Addressable Memories. Springer-Verlag, New York, 1980.

[M+90] Masato Motomura et al. A 1.2 million transistor, 33-MHz, 20-b dictionary search (DISP) ULSI with a 160-kb CAM. Journal of Solid-State Circuits, 25(5):1158-1165, October 1990.

[M+92] Dan Moldovan et al. SNAP: Parallel processing applied to AI. Computer, 25(5):39-49, May 1992.

[Min86] Marvin Minsky. The Society of Mind. Simon and Schuster, New York, 1986.

[NGC89] Yah Ng, Raymond Glover, and Chew-Lye Chng. Unify with active memory. In Jose G. Delgado-Frias and Will R. Moore, editors, VLSI for Artificial Intelligence. Kluwer Academic Publishers, Boston, Mass., 1989.

[Nic90] John R. Nickolls. The design of the MasPar MP-1: A cost effective massively parallel computer. In Proceedings of COMPCON, pages 25-28, February 1990.

[O+89] Takeshi Ogura et al. A 20-kbit associative memory LSI for artificial intelligence machines. Journal of Solid-State Circuits, 24(4):1014-1020, August 1989.

[PF89] R. W. Prager and F. Fallside. The modified Kanerva model for automatic speech recognition. Computer Speech and Language, 3(1):61-81, 1989.

[RG91] D. P. Rodohan and R. J. Glover. An overview of the A* architecture for optimization problems in a logic programming environment. ACM Computer Architecture News, 19(4):124-131, June 1991.

[Sch92] Mario Schultz. An object-oriented interface in C++ to an associative processor. In CompEuro Proceedings: Computer Systems and Software Engineering, Washington, D.C., 1992. IEEE Computer Society Press.

[SDDS86] J. T. Schwartz, R. B. K. Dewar, E. Dubinsky, and E. Schonberg. Programming with Sets: An Introduction to SETL. Springer-Verlag, New York, 1986.

[Sha88] David Elliot Shaw. Organization and operation of massively parallel machines. In Guy Rabbat, editor, Handbook of Advanced Semiconductor Technology and Computer Systems. Van Nostrand Reinhold, New York, 1988.

[Sow91] John F. Sowa, editor. Principles of Semantic Networks. Morgan Kaufmann Publishers, San Mateo, CA, 1991.

[WLL90] Benjamin W. Wah, Matthew B. Lowrie, and Guo-Jie Li. Computers for symbolic processing. In Benjamin W. Wah and C. V. Ramamoorthy, editors, Computers for Artificial Intelligence Processing, pages 1-73. John Wiley & Sons, New York, 1990.