From: AAAI Technical Report SS-95-08. Compilation copyright © 1995, AAAI (www.aaai.org). All rights reserved. Clustering and Information Sharing in an Ecology of Cooperating Agents Leonard N Foner Agents Group, MIT Media Lab E15-305, 20 Ames St Cambridge, MA 02139 foner@media.mit.edu [Shardanand95]) generally require a quadratic-order matching step somewhere,which makes them scale quite poorly. Often, this matchingstep must be performedin a single place, worsening the problemsof bottlenecks--such centralization is often the result of there being no effective wayfor peers to find each other, or no effective wayfor any clumpof peers to knowwhether a clumpsomewhereelse is related to them in any way. Other solutions have certainly been proposed and adopted. For example,hierarchical organization of the entities, as is done with resource records in the Internet domainname system [Mockapetris 87] or with newsgrouptopics in the Usenet [Horton 87], can help to reduce the inherently quadratic problem into a logarithmic one. However, such approaches depend on someinherent organizational principle that is established in advance, whichis neither alwaysoptimal nor always convenient (for example, consider the number of crosspostedUsenetarticles, a clear indication that singleinheritance hierarchies are not necessarily a goodmatch to the underlyingtopic space). Abstract Software agents have becomeincreasingly popular for a variety of applications, manyof whichbenefit from distribution on a network. However, many commonapproaches scale poorly in such an environment, because they assume the feasibility of knowing about all or most of any given agent’s peers. Weinvestigate here someapproaches and applications whichwill serve as a testbed for evaluating solutions to the above problems.These approachesinvolve clustering agents into clumpsor "niches" that need only local knowledge;this work hence has relevancein the fields of learning, knowledgesharing, and distributed AI. Parts of the ideas above have been implemented,but manyof themare speculative at this point, and feedback is actively encouraged. 1 Introduction Software agents have becomean increasingly popular approach in dealing with information discovery and information filtering in a networkedenvironment,either for the purposes of utility (e.g., [Maes94], [Lashkariet al 94], etc), entertainment (e.g., [Mauldin94] [Foner 93]). Manyforeseeable future applications for software agents (such as extensions of collaborative interface agents [Lashkari et al 94] or social filtering [Shardanand95], to pick just two of many) involve large numbers of agents interacting with each other. Users mayhave a numberof agents operating on their behalf, and agents of any particular user mayhave to communicatewith other agents elsewhere on the network in order to share information. The overall organization mayresemble a computational ecology [Huberman88], in which agents have only local knowledge,but self-organize into larger units. Such ecologies of agents present several scaling issues, however, one of which is howagents are supposed to find each other. Each agent should not have to knowabout (and, indeed, probably cannot knowabout) every other agent, user, or resource on the network. Perhaps a worst-case for such systems involve matchmaker-likeproblems, in which, from a set of peers, we must find good matches based on some attribute. Straightforward approaches to this problem(e.g., 2 A motivating example: Matchmaking The motivating examplefor this research is that of a matchmaker, whichcan be in one of several domains.In this paper, we will investigate the domainof newspaperclassified ads, transported to a an electronic context. Individual users (via their agents) submit either "buy" or "sell" ads (akin to messages in a for-sale newsgroup); these ads are then matched with each other. Weassumehere that each agent has a particular ad that correspondsto somerequest of its user. Bothbuyers and sellers post ads. Weuse the term "ad" and "agent" somewhatinterchangeably whentalking about communication;in particular, whenwe talk about an ad communicatingwith another ad, we really meanthe agent correspondingto that ad. For our purposes, it does not matter whetherthe agent (and its corresponding ad) stays on a particular workstation and simply opens network connections to other agents, or instead the agent itself (e.g., its thread of control) movesaroundthe networka la Telescript [Telescript 94]. However,for clarity of presentation, we assume that the agent stays put and opens network connections. 57 2.1 Problemswith typical approaches Clearly, every agent on the networkshould not have to see every ad, nor should there be a single clearinghouseof such ads or of all agents--neither approachscales. Newspapersgenerally force division into categories whenrunning classifieds, but such categories are often either too wide, too narrow, or off the markin someother wayfor their intended readership. Further, by centralizing the distribution of the ads, newspapers introduce artifacts of timing, geographicalregion, and so forth, whichmayalso be a poor matchfor particular readers. Posting and reading ads on the Usenet suffers from similar problems.Usenetposters by definition flood any post to all machines, in the same waythat every recipient of a newspaper gets every page of the newpaper’sclassified ads. Because manytopics are simultaneously too narrow along one dimension (causing crossposting to other groups) and too wide along others (causing manydissimilar ads to be groupedinto the samenewsgroup),potential readers must scan every article in a possibly large numberof newsgroupsto be sure of finding relevant ads. (If one does not a priori trust the imposedhierarchical clustering to cause advertisers to correctly cluster, then one maybe reduced to scanning all newsgroups, a 70-meg-per-dayproposition. Likewise, if one does not trust one’s readers to read appropriate newsgroups,one is tempted to crosspost to increasingly irrelevant newsgroups,as the recent "green card lawyers" incident demonstrates.) therefore experimenting with SMART [Yanoff 71] to establish a distance metric. Otheralternatives certainly exist, and will be investigated soon. For example, a semantic-net approach can yield less "discontinuous" matching (wherein "house and "apartment" are close in the concept space), at the possible cost of increased computationand a larger requirement that the agent knowsomethingabout the domainof the ad. In addition, a semantic-net approach offers promise of more interesting distance metrics between ads than simple similarity of keywords, such as via antonymical relationships--for example, we could use WordNet[Miller 93], which specifies a semantic net of common words, with help from part-of-speech tagging [Cutting et al 92], to find, e.g., negations such as "I do not wantto live in an apartment." 2.2.2 The clump space Weturn nowto actually using a distance metric to cause ads to agglutinate in clumpspace. The aim is to cluster ads on similar topics together, so that buyers need only search through ads which relate to what they wish to buy, and so forth. (A semantic-net approach could also allow us to split such a clumpof "buy" and "sell" ads into two subclumps, in whichall the buyers are likewise clustered apart fromthe sellers-we ignore such a refinement here for the moment.) Consider, for a moment,an individual ad. It "talks" only to neighboring ads. It can, however, pick up and moveto a 2.2 Somepotential solutions different neighborhoodalong somedimension, if it finds out from a neighbor that somenearby ad is closer to its meaning. This paper takes the problem of matchingclassified ads to (Again, it does not matter whether such "movement"corretheir intended readers in a somewhatdifferent direction. The sponds to a literal movementof the agent’s computationto fundamentalidea is to dynamicallycluster ads into clumpsloanother workstation, or instead correspondsto openinga netcated in a space. Oncea cluster (or clump)is formed,we need workconnection and talking to an agent that way.) only search a relatively small space to find relevant ads, rather than the space of all ads. In this way, ads migrate around the clumpspace, always Both buyers and sellers post ads. Adsonly talk to nearby talking only to local neighbors, gradually forming clusters. Eachad can talk to multiple different local neighborsat once, neighbors in the space. Since the clumpsare dynamicallycreated, a predefined hierarchy is avoided. Since communication where each such neighbor is a neighbor along a different dimensionin concept space. Thus, two ads, one of which says is strictly local, manyscaling effects are likewise reduced. "I’d like to buy a house" and the other of which says "I’m in This approach requires that we define a distance metric between any pair of ads, and a methodof clustering ads which the marketfor an apartment"will tend to be close in two important dimensions: both of them are posted by a buyer, and have small distances between them. Weactually have two both of them are housing-related. They will tend to form an spaces here: a concept space, consisting of howads would cluster if every ad knewabout every other ad, and a clump easily-findable clumpthat sellers can therefore find, by followinga gradient of links in conceptspace. 1 space, consisting of howthe ads are actually clustered, given their local knowledge, networkconnectivity, and so forth. Note that the schemeabove assumeswe have the conceptThe clumpspace is similar to but not exactly the sameas the space distance metric mentionedearlier. It also assumes we concept space. have two moreabilities: ¯ A way of finding peers 2.2.1 The concept space Thereare several possibilities for establishing a distance met- ¯ A wayof establishing a gradient in clumpspace ric betweenads and hence a concept space. Simplekeywordmatchingdefines a very irregular feature space; "house" and "apartment"are very far apart in a strict 1. If there wereonlya fewads in the entire network,this particular keywordmatch, but are semantically close nonetheless. Deexamplewouldnot workwithout a semantic-net distance metric, spite this shortcoming,keywordmatchingis quite fast, yields since a keyword-matcher wouldnot equate "house" and "aparta fairly simple one-dimensional distance between pairs of ment."However, if there are manyads in the system,weassumethat someof themwill use both"house"and "apartment"in the samead, ads, and is usable with only very small knowledgeof what whichwill tend to cause ads mentioningonly one or the other to words meanin the domain(e.g., we need to stem words, but clumpup with ads that mentionboth. It remainsto be seenhowefnot knowwhat they mean). For initial experiments, we are fectivethis strategywill be. 58 2.2.2.1 Finding peers long way to go, however, and feedback on these approaches in encouraged. Someof the more important unsolved pieces For ads to form clumps, they must be able to find other ads include: with which to clump. Anysolution which scales cannot rely ¯ The details of the distribution. Whichapplications are on a central registry of all ads or all agents, henceindividual best done with keyword-basedmatching, which with seads must have a way of joining an ad-hoc network of other mantic-net-based matching, and which with completely ads. different sorts of matching? Whatdoes this imply about To enable this, each agent keeps track of the location on the dimensionality of the resulting spaces? Howimporthe networkof previous agents it has encountered(e.g., other tant is visualization of the result? agents which have either initiated or respondedto communi¯ Issues of user privacy. Notice that a buyermust essentialcation). It neednot keepa completelist of all such agents, but ly advertise her interests to a large numberof a priori unsimply a large enoughset to makeit unlikely that all rememknownagents. If the item being sought is supposedto be bered agents form a clique in whichthere are no links to any secret, this is a problem.Parallel research is being conagents outside the clique. (This is to makethe probability of ducted into ways of using cryptographic protocols (such disconnectedspaces small.) The exact setting of this parameas zero-knowledgeproofs [Schneier 94] and various othter is unknown,but keepingtrack of the last 100-500agent loers) to help contain the resulting privacy exposures--the cations should be trivial and probablysufficient. test domainfor this research employsan application that Whenan agent is first created, it mustguess at contact inattempts to matchusers by their interests. Unfortunately, formation to find some other "connected" agent. There are a completedescription of these approachesis well outside manyheuristics for accomplishingthis, most of themoutside the scopeof this paper. of the scopeof this paper. Easytechniques include asking the ¯ Howto distribute such an application across the Internet user for likely machinesto contact, trying all locally-connectin a waythat makesit easiest for newusers to install and ed machines (via a broadcast or some other cheap mechause. Easeof installation is critical to quicklyestablishing nism) to see if any of themhave agents, or consulting an ada large user base, and such a large user base is essential to hoc or site-specific registry containingnearbylikely addressgive the communityof agents enough size to start doing es (such a registry need not knowabout all agents, of course, useful work. Such distribution also interacts with the merely a very small numberof them). Fromsuch contacts, an point above about privacy; mechanismsfor ensuring that agent can bootstrap its wayinto the networkby asking for rethe software received is trustworthy are important. ferrals, as described below. All of these areas are currently under active development. 2.2.2.2 Establishing a gradient In general, an ad attempting to find a suitable clumpasks someother ad, "Are you close to me in concept space?" using somedistance metric. If the answeris yes, the ad joins the clump. If the answeris no, however,the itinerant ad mustask for a referral to someother, morecompatiblead. If the referrals received were simply random(e.g., a complete listing of all ads previously encountered, or somerandomsubset of them), forming a clump would resemble randomdiffusion and wouldlikely be very slow. Instead, we attempt to establish a gradient in concept space that enables quicker formation of clumps. Wedo this by having each agent remember,in addition to which agents it has recently encountered, something about their ads as well. For example,we could rememberthe entire text of the last n ads, or somesummarization(such as a point in a vector space). Whena new agent comes looking for suitable referral, the agent being askedcould find the nearest rememberedad from one of its partners, and redirect the new 2agent as appropriate. 3 Conclusions By distributing the computationinto a large numberof small pieces, all of whichare relatively local and do not require knowledgeof the global state, this research has aimed to make large, networked information retrieval applications moretractable. It presents a typical exampleof an application that can benefit fromdistribution, and talks about waysto accomplishthe distribution of the computation.There is still a 59 4 References [Cutting et al 92] "A Practical Part-of-Speech Tagger", DougCutting, Julian Kupiec, Jan Pedersen, and Penelope Sibun, XeroxTechnical Report, 1992. [Foner 93] "What’s an Agent, Anyway?A Sociological Case Study", Leonard Foner, Agents Memo93-01, MIT Media Lab, Cambridge, MA,1993. (ftp://media.mit.edu/pub/Foner/Papers/ Agents--Julia.ps) [Horton 87] "Standard for interchange of USENET messages", Mark Horton and R Adams, Internet RFC1036.(ftp://ds.internic.net/rfc/rfc 1036.txt.) [Huberman 88] The Ecologyof Computation,B. A. Huberman,editor, Elsevier Science Publishers B.V., 1988. 2. Suchbehavioris altruistic, in that anygivenagenthas little incentiveto remember prior ads, thoughif there is uniformityin the programming of each agent, the community as a wholewill benefit, andso will eachindividualagent. Solutionswhichdependless on altruism are being investigated; however,they maynot be strictly necessary.If the overheadof such referrals is relativelysmall,andtheir utility sufficientlyobviousto the agents’ user community, then few users will be motivatedto eliminatethem,evenin the absenceof peer pressureor moredraconian,tit-for-tat responses. [Lashkariet al 94] "Collaborative Interface Agents", Yezdi Lashkari, MaxMetral, and Pattie Maes, Proceedings of the Twelfth National Conferenceon Artificial Intelligence, MITPress, Cambridge, MA, 1994. [Maes 94] "Agents that ReduceWorkand Information Overload", Pattie Maes, Communicationsof the ACM,July 1994, Vol 37 #7. [Mauldin 94] "ChatterBots, TinyMuds,and the Turing Test: Entering the Loebner Prize Competition", Michael Mauldin, Proceedings of the 7~velfth NationalConferenceon Artificial Intelligence, MIT Press, Cambridge, MA, 1994. (http:// fuzine.mc.cs.cmu.edu/mlm/aaai94.html) [Miller 93] "Introduction to WordNet:An On-line Lexical Database", George Miller, Richard Beckwith, Christiane Fellbaum, Derek Gross, and Katherine Miller, Princeton University Technical Report, 1993. [Mockapetris 87] "Domainnames - implementation and specification", Paul Mockapetris, Internet RFC1034.(ftp://ds.internic.net/rfc/ rfc1034.) [Shardanand 95] "Social Information Filtering: Algorithms for Automating ’Word of Mouth’", Upendra Shardanand and Pattie Maes, submitted to CHI-95, Denver, CO, May1995. [Schneier 94] Applied Cryptography, Bruce Schneier, John Wiley & Sons, 1994. [Telescript 94] Telescript Technology: The Foundation for the Electronic Marketplace, General Magic, Inc, 1994. [Zumoff71 ] "Users Manual for the SMART Information Retrieval System", Joel Zumoff,Cornell Technical Report 71-95.