Clustering and Information Sharing in an

advertisement
From: AAAI Technical Report SS-95-08. Compilation copyright © 1995, AAAI (www.aaai.org). All rights reserved.
Clustering and Information Sharing in an
Ecology of Cooperating Agents
Leonard N Foner
Agents Group, MIT Media Lab
E15-305, 20 Ames St
Cambridge,
MA 02139
foner@media.mit.edu
[Shardanand95]) generally require a quadratic-order matching step somewhere,which makes them scale quite poorly.
Often, this matchingstep must be performedin a single place,
worsening the problemsof bottlenecks--such centralization
is often the result of there being no effective wayfor peers to
find each other, or no effective wayfor any clumpof peers to
knowwhether a clumpsomewhereelse is related to them in
any way.
Other solutions have certainly been proposed and adopted. For example,hierarchical organization of the entities, as
is done with resource records in the Internet domainname
system [Mockapetris 87] or with newsgrouptopics in the
Usenet [Horton 87], can help to reduce the inherently quadratic problem into a logarithmic one. However, such approaches depend on someinherent organizational principle
that is established in advance, whichis neither alwaysoptimal nor always convenient (for example, consider the number
of crosspostedUsenetarticles, a clear indication that singleinheritance hierarchies are not necessarily a goodmatch to
the underlyingtopic space).
Abstract
Software agents have becomeincreasingly popular
for a variety of applications, manyof whichbenefit
from distribution on a network. However, many
commonapproaches scale poorly in such an environment, because they assume the feasibility of
knowing about all or most of any given agent’s
peers.
Weinvestigate here someapproaches and applications whichwill serve as a testbed for evaluating solutions to the above problems.These approachesinvolve clustering agents into clumpsor "niches" that
need only local knowledge;this work hence has relevancein the fields of learning, knowledgesharing,
and distributed AI.
Parts of the ideas above have been implemented,but
manyof themare speculative at this point, and feedback is actively encouraged.
1
Introduction
Software agents have becomean increasingly popular approach in dealing with information discovery and information filtering in a networkedenvironment,either for the purposes of utility (e.g., [Maes94], [Lashkariet al 94], etc),
entertainment (e.g., [Mauldin94] [Foner 93]).
Manyforeseeable future applications for software agents
(such as extensions of collaborative interface agents [Lashkari et al 94] or social filtering [Shardanand95], to pick just
two of many) involve large numbers of agents interacting
with each other. Users mayhave a numberof agents operating
on their behalf, and agents of any particular user mayhave to
communicatewith other agents elsewhere on the network in
order to share information. The overall organization mayresemble a computational ecology [Huberman88], in which
agents have only local knowledge,but self-organize into larger units.
Such ecologies of agents present several scaling issues,
however, one of which is howagents are supposed to find
each other. Each agent should not have to knowabout (and,
indeed, probably cannot knowabout) every other agent, user,
or resource on the network. Perhaps a worst-case for such
systems involve matchmaker-likeproblems, in which, from a
set of peers, we must find good matches based on some attribute. Straightforward approaches to this problem(e.g.,
2
A motivating example: Matchmaking
The motivating examplefor this research is that of a matchmaker, whichcan be in one of several domains.In this paper,
we will investigate the domainof newspaperclassified ads,
transported to a an electronic context. Individual users (via
their agents) submit either "buy" or "sell" ads (akin to messages in a for-sale newsgroup); these ads are then matched
with each other.
Weassumehere that each agent has a particular ad that
correspondsto somerequest of its user. Bothbuyers and sellers post ads. Weuse the term "ad" and "agent" somewhatinterchangeably whentalking about communication;in particular, whenwe talk about an ad communicatingwith another
ad, we really meanthe agent correspondingto that ad. For our
purposes, it does not matter whetherthe agent (and its corresponding ad) stays on a particular workstation and simply
opens network connections to other agents, or instead the
agent itself (e.g., its thread of control) movesaroundthe networka la Telescript [Telescript 94]. However,for clarity of
presentation, we assume that the agent stays put and opens
network connections.
57
2.1 Problemswith typical approaches
Clearly, every agent on the networkshould not have to see every ad, nor should there be a single clearinghouseof such ads
or of all agents--neither approachscales. Newspapersgenerally force division into categories whenrunning classifieds,
but such categories are often either too wide, too narrow, or
off the markin someother wayfor their intended readership.
Further, by centralizing the distribution of the ads, newspapers introduce artifacts of timing, geographicalregion, and so
forth, whichmayalso be a poor matchfor particular readers.
Posting and reading ads on the Usenet suffers from similar problems.Usenetposters by definition flood any post to all
machines, in the same waythat every recipient of a newspaper gets every page of the newpaper’sclassified ads. Because
manytopics are simultaneously too narrow along one dimension (causing crossposting to other groups) and too wide
along others (causing manydissimilar ads to be groupedinto
the samenewsgroup),potential readers must scan every article in a possibly large numberof newsgroupsto be sure of
finding relevant ads. (If one does not a priori trust the imposedhierarchical clustering to cause advertisers to correctly
cluster, then one maybe reduced to scanning all newsgroups,
a 70-meg-per-dayproposition. Likewise, if one does not trust
one’s readers to read appropriate newsgroups,one is tempted
to crosspost to increasingly irrelevant newsgroups,as the recent "green card lawyers" incident demonstrates.)
therefore experimenting with SMART
[Yanoff 71] to establish a distance metric.
Otheralternatives certainly exist, and will be investigated
soon. For example, a semantic-net approach can yield less
"discontinuous" matching (wherein "house and "apartment"
are close in the concept space), at the possible cost of increased computationand a larger requirement that the agent
knowsomethingabout the domainof the ad. In addition, a semantic-net approach offers promise of more interesting distance metrics between ads than simple similarity of keywords, such as via antonymical relationships--for example,
we could use WordNet[Miller 93], which specifies a semantic net of common
words, with help from part-of-speech tagging [Cutting et al 92], to find, e.g., negations such as "I do
not wantto live in an apartment."
2.2.2 The clump space
Weturn nowto actually using a distance metric to cause ads
to agglutinate in clumpspace. The aim is to cluster ads on
similar topics together, so that buyers need only search
through ads which relate to what they wish to buy, and so
forth. (A semantic-net approach could also allow us to split
such a clumpof "buy" and "sell" ads into two subclumps, in
whichall the buyers are likewise clustered apart fromthe sellers-we ignore such a refinement here for the moment.)
Consider, for a moment,an individual ad. It "talks" only
to neighboring ads. It can, however, pick up and moveto a
2.2 Somepotential solutions
different neighborhoodalong somedimension, if it finds out
from a neighbor that somenearby ad is closer to its meaning.
This paper takes the problem of matchingclassified ads to
(Again, it does not matter whether such "movement"corretheir intended readers in a somewhatdifferent direction. The
sponds to a literal movementof the agent’s computationto
fundamentalidea is to dynamicallycluster ads into clumpsloanother workstation, or instead correspondsto openinga netcated in a space. Oncea cluster (or clump)is formed,we need
workconnection and talking to an agent that way.)
only search a relatively small space to find relevant ads, rather
than the space of all ads.
In this way, ads migrate around the clumpspace, always
Both buyers and sellers post ads. Adsonly talk to nearby talking only to local neighbors, gradually forming clusters.
Eachad can talk to multiple different local neighborsat once,
neighbors in the space. Since the clumpsare dynamicallycreated, a predefined hierarchy is avoided. Since communication where each such neighbor is a neighbor along a different dimensionin concept space. Thus, two ads, one of which says
is strictly local, manyscaling effects are likewise reduced.
"I’d like to buy a house" and the other of which says "I’m in
This approach requires that we define a distance metric between any pair of ads, and a methodof clustering ads which the marketfor an apartment"will tend to be close in two important dimensions: both of them are posted by a buyer, and
have small distances between them. Weactually have two
both of them are housing-related. They will tend to form an
spaces here: a concept space, consisting of howads would
cluster if every ad knewabout every other ad, and a clump easily-findable clumpthat sellers can therefore find, by followinga gradient of links in conceptspace. 1
space, consisting of howthe ads are actually clustered, given
their local knowledge, networkconnectivity, and so forth.
Note that the schemeabove assumeswe have the conceptThe clumpspace is similar to but not exactly the sameas the
space distance metric mentionedearlier. It also assumes we
concept space.
have two moreabilities:
¯ A way of finding peers
2.2.1 The concept space
Thereare several possibilities for establishing a distance met- ¯ A wayof establishing a gradient in clumpspace
ric betweenads and hence a concept space.
Simplekeywordmatchingdefines a very irregular feature
space; "house" and "apartment"are very far apart in a strict
1. If there wereonlya fewads in the entire network,this particular
keywordmatch, but are semantically close nonetheless. Deexamplewouldnot workwithout a semantic-net distance metric,
spite this shortcoming,keywordmatchingis quite fast, yields
since a keyword-matcher
wouldnot equate "house" and "aparta fairly simple one-dimensional distance between pairs of
ment."However,
if there are manyads in the system,weassumethat
someof themwill use both"house"and "apartment"in the samead,
ads, and is usable with only very small knowledgeof what
whichwill tend to cause ads mentioningonly one or the other to
words meanin the domain(e.g., we need to stem words, but
clumpup with ads that mentionboth. It remainsto be seenhowefnot knowwhat they mean). For initial experiments, we are
fectivethis strategywill be.
58
2.2.2.1 Finding peers
long way to go, however, and feedback on these approaches
in encouraged. Someof the more important unsolved pieces
For ads to form clumps, they must be able to find other ads
include:
with which to clump. Anysolution which scales cannot rely
¯
The details of the distribution. Whichapplications are
on a central registry of all ads or all agents, henceindividual
best done with keyword-basedmatching, which with seads must have a way of joining an ad-hoc network of other
mantic-net-based matching, and which with completely
ads.
different sorts of matching? Whatdoes this imply about
To enable this, each agent keeps track of the location on
the dimensionality of the resulting spaces? Howimporthe networkof previous agents it has encountered(e.g., other
tant is visualization of the result?
agents which have either initiated or respondedto communi¯
Issues of user privacy. Notice that a buyermust essentialcation). It neednot keepa completelist of all such agents, but
ly advertise her interests to a large numberof a priori unsimply a large enoughset to makeit unlikely that all rememknownagents. If the item being sought is supposedto be
bered agents form a clique in whichthere are no links to any
secret, this is a problem.Parallel research is being conagents outside the clique. (This is to makethe probability of
ducted into ways of using cryptographic protocols (such
disconnectedspaces small.) The exact setting of this parameas zero-knowledgeproofs [Schneier 94] and various othter is unknown,but keepingtrack of the last 100-500agent loers)
to help contain the resulting privacy exposures--the
cations should be trivial and probablysufficient.
test domainfor this research employsan application that
Whenan agent is first created, it mustguess at contact inattempts to matchusers by their interests. Unfortunately,
formation to find some other "connected" agent. There are
a completedescription of these approachesis well outside
manyheuristics for accomplishingthis, most of themoutside
the scopeof this paper.
of the scopeof this paper. Easytechniques include asking the
¯ Howto distribute such an application across the Internet
user for likely machinesto contact, trying all locally-connectin a waythat makesit easiest for newusers to install and
ed machines (via a broadcast or some other cheap mechause. Easeof installation is critical to quicklyestablishing
nism) to see if any of themhave agents, or consulting an ada large user base, and such a large user base is essential to
hoc or site-specific registry containingnearbylikely addressgive the communityof agents enough size to start doing
es (such a registry need not knowabout all agents, of course,
useful work. Such distribution also interacts with the
merely a very small numberof them). Fromsuch contacts, an
point above about privacy; mechanismsfor ensuring that
agent can bootstrap its wayinto the networkby asking for rethe software received is trustworthy are important.
ferrals, as described below.
All of these areas are currently under active development.
2.2.2.2 Establishing a gradient
In general, an ad attempting to find a suitable clumpasks
someother ad, "Are you close to me in concept space?" using
somedistance metric. If the answeris yes, the ad joins the
clump. If the answeris no, however,the itinerant ad mustask
for a referral to someother, morecompatiblead.
If the referrals received were simply random(e.g., a complete listing of all ads previously encountered, or somerandomsubset of them), forming a clump would resemble randomdiffusion and wouldlikely be very slow. Instead, we attempt to establish a gradient in concept space that enables
quicker formation of clumps.
Wedo this by having each agent remember,in addition to
which agents it has recently encountered, something about
their ads as well. For example,we could rememberthe entire
text of the last n ads, or somesummarization(such as a point
in a vector space). Whena new agent comes looking for
suitable referral, the agent being askedcould find the nearest
rememberedad from one of its partners, and redirect the new
2agent as appropriate.
3
Conclusions
By distributing the computationinto a large numberof small
pieces, all of whichare relatively local and do not require
knowledgeof the global state, this research has aimed to
make large, networked information retrieval applications
moretractable. It presents a typical exampleof an application
that can benefit fromdistribution, and talks about waysto accomplishthe distribution of the computation.There is still a
59
4
References
[Cutting et al 92]
"A Practical Part-of-Speech Tagger", DougCutting, Julian
Kupiec, Jan Pedersen, and Penelope Sibun, XeroxTechnical
Report, 1992.
[Foner 93]
"What’s an Agent, Anyway?A Sociological Case Study", Leonard Foner, Agents Memo93-01, MIT Media Lab, Cambridge, MA,1993. (ftp://media.mit.edu/pub/Foner/Papers/
Agents--Julia.ps)
[Horton 87]
"Standard for interchange of USENET
messages", Mark
Horton and R Adams, Internet RFC1036.(ftp://ds.internic.net/rfc/rfc 1036.txt.)
[Huberman 88]
The Ecologyof Computation,B. A. Huberman,editor, Elsevier Science Publishers B.V., 1988.
2. Suchbehavioris altruistic, in that anygivenagenthas little incentiveto remember
prior ads, thoughif there is uniformityin the
programming
of each agent, the community
as a wholewill benefit,
andso will eachindividualagent.
Solutionswhichdependless on altruism are being investigated;
however,they maynot be strictly necessary.If the overheadof such
referrals is relativelysmall,andtheir utility sufficientlyobviousto
the agents’ user community,
then few users will be motivatedto
eliminatethem,evenin the absenceof peer pressureor moredraconian,tit-for-tat responses.
[Lashkariet al 94]
"Collaborative Interface Agents", Yezdi Lashkari, MaxMetral, and Pattie Maes, Proceedings of the Twelfth National
Conferenceon Artificial Intelligence, MITPress, Cambridge,
MA, 1994.
[Maes 94]
"Agents that ReduceWorkand Information Overload", Pattie
Maes, Communicationsof the ACM,July 1994, Vol 37 #7.
[Mauldin 94]
"ChatterBots, TinyMuds,and the Turing Test: Entering the
Loebner Prize Competition", Michael Mauldin, Proceedings
of the 7~velfth NationalConferenceon Artificial Intelligence,
MIT Press,
Cambridge, MA,
1994.
(http://
fuzine.mc.cs.cmu.edu/mlm/aaai94.html)
[Miller 93]
"Introduction to WordNet:An On-line Lexical Database",
George Miller, Richard Beckwith, Christiane Fellbaum,
Derek Gross, and Katherine Miller, Princeton University
Technical Report, 1993.
[Mockapetris 87]
"Domainnames - implementation and specification", Paul
Mockapetris, Internet RFC1034.(ftp://ds.internic.net/rfc/
rfc1034.)
[Shardanand 95]
"Social Information Filtering: Algorithms for Automating
’Word of Mouth’", Upendra Shardanand and Pattie Maes,
submitted to CHI-95, Denver, CO, May1995.
[Schneier 94]
Applied Cryptography, Bruce Schneier, John Wiley & Sons,
1994.
[Telescript 94]
Telescript Technology: The Foundation for the Electronic
Marketplace, General Magic, Inc, 1994.
[Zumoff71 ]
"Users Manual for the SMART
Information Retrieval System", Joel Zumoff,Cornell Technical Report 71-95.
Download