From: AAAI Technical Report WS-99-12. Compilation copyright © 1999, AAAI (www.aaai.org). All rights reserved.

Smaller, Faster Agent Dialogues via Conversational Probing

Tim Menzies†, Bojan Cukic¶, Enrico Coiera‡
†NASA/WVU Software Research Lab, 100 University Drive, Fairmont, USA
¶Department of Computing Science and Electrical Engineering, West Virginia University, Morgantown, WV
‡Department of Computer Science and Engineering, University of New South Wales, Australia 2052
<tim@menzies.com, cukic@csee.wvu.edu, ewc@pobox.com>

Abstract
Studies of human conversation suggest that agents whose world models are in consensus can work well together using only very narrow bandwidths. The total bandwidth required between agents could hence be minimised if we could recognise when model consensus breaks down. At the breakdown point, the communication policy could switch from some usual-case low value to a temporary high value while the model conflict is resolved. To effectively recognise the breakdown point, we need tools that recognise model conflicts without requiring extensive bandwidth. A mathematical model of probing and-or graphs suggests that, for a large range of interesting models, the number of probes required to detect consensus breakdown is quite low.

Keywords: Agent communication, Negotiation protocols, Negotiating to maintain coalitions, Facilitating the negotiation process

Agents running on distributed computers may interact under a variety of resource restrictions. One restriction is bandwidth. Every transmission also takes time since it must be processed by both the sender and the receiver. Studies of human communication suggest that failure to optimise the use of available resources can result in inefficient communication within organisations and consequent task errors (Coiera 1996). How can we minimise the use of those resources for agents, and maximise the likelihood that communication tasks are successfully completed?

Consider two humans communicating via a channel with a certain bandwidth. A robust finding from human conversations is that restricting bandwidth has little impact on the effectiveness of the outcomes of many collaborative problem solving tasks (McCarthy & Monk 1994). This occurs when the communicating individuals have a high degree of shared common ground. A particularly striking finding is that humans assume they share common world views in conversation until they detect an error, and then attempt to probe the cause of the error. Interestingly, human conversation seems to consist of both a small explicit set of exchanges devoted to testing or confirming understanding and the primary exchange (J Clarke 1991).

Can we use these observations to minimise the bandwidth required between computational agents? Suppose agents move through alternating stages of assumed model consensus, detection of model conflict, and then model repair. While agent models are in conflict, we broaden the bandwidth between them. When models are in consensus, we restrict the bandwidth by assuming shared knowledge between agents. As a result, bandwidth is conserved during conversation until it is explicitly demanded to reconcile the world models of conversing agents. Time is also conserved by minimising unnecessary sharing of knowledge between agents (Coiera 1999).

This "assume-consensus" assumption has another advantage. Consider agents with a learning component. Such agents may update their world model. If they wish to coordinate with other agents, then they would need to reflect on the beliefs of their fellows. In the worst case, this means that each agent must maintain one belief set for every other agent in its community. This belief set should include what the other agent thinks about all the other agents. If that second-level belief set includes the original agent's beliefs, an infinite regress may occur (e.g. "I think he thinks I think that he thinks that I think that ..."). The "assume-consensus" model addresses the infinite regress problem. Agents in assumed consensus only need to store their own beliefs and one extra axiom; i.e. "if I believe that she believes what I do, then my beliefs equal her beliefs".

To apply this approach, agents have to continually test that other agents hold their beliefs. One method for doing this would be to ask agents to dump their belief sets to each other. We consider this approach impractical for three reasons:
• It incurs the penalty of the infinite regress, discussed above.
• Such belief-dumps could exhaust the available bandwidth.
• Agents may not wish to give others full access to their internal beliefs. For example, security issues may block an agent from one vendor accessing the beliefs of an agent from another vendor.

Without access to internal structures, how can one agent assess the contents of another? Software engineering has one answer to this question: black-box specification-based probing (Hamlet & Taylor 1990; Lowrey, Boyd, & Kulkarni 1998). Agent-A could log its own behaviour to generate a library of input-output pairs. In doing so, Agent-A is using itself as a specification of the expected behaviour of Agent-B, assuming our two agents are in consensus. If Agent-A sees those outputs when Agent-B is presented with those inputs, then Agent-A could infer that Agent-B has the same beliefs as itself. Unfortunately, black-box probes can be very bandwidth expensive.

[Figure 1: plot of confidence C (0.1 to 0.9) against number of tests N (log scale, 10 to 1e+08), with one curve per failure rate: 1 failure in 100, 1,000, 10,000, 100,000 and 1,000,000.]
Figure 1: $C = 1 - (1 - a)^N$. Theoretically, 4603 probes are required to achieve a 99% chance of detecting moderately infrequent items; i.e. ones that only occupy $\frac{1}{10^3}$-th of the model (Hamlet & Taylor 1990).
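The relation in the Figure 1 caption, $C = 1 - (1 - a)^N$, can be inverted to give the number of probes needed for a target confidence: $N \ge \ln(1 - C) / \ln(1 - a)$. The following Python sketch is illustrative only; the function names `confidence` and `probes_needed` are hypothetical and do not come from the paper.

```python
import math


def confidence(n_probes: int, frequency: float) -> float:
    """C = 1 - (1 - a)^N: chance that N random probes hit an item of frequency a."""
    return 1.0 - (1.0 - frequency) ** n_probes


def probes_needed(target_confidence: float, frequency: float) -> int:
    """Smallest N with 1 - (1 - a)^N >= C, i.e. N >= ln(1 - C) / ln(1 - a)."""
    return math.ceil(math.log(1.0 - target_confidence) / math.log(1.0 - frequency))


if __name__ == "__main__":
    # Items occupying 1/1000-th of the model, 99% confidence (the Figure 1 example).
    print(probes_needed(0.99, 1e-3))  # 4603
    # Items occupying 1/100-th of the model, 99% confidence.
    print(probes_needed(0.99, 1e-2))  # 459
```

For $a = 10^{-3}$ this reproduces the 4603 probes quoted in the caption, which illustrates how quickly the probe count grows as the anomaly frequency shrinks.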
Equation 1 shows a widely-used statistical model connecting the number of probes $N$ to the confidence $C$ that we will find an error with frequency of occurrence $a$:

$$C = 1 - (1 - a)^N \qquad (1)$$

Equation 1 says that for probes to be 99% certain of detecting anomalies, the number of black-box probes required is approximately 4.6 divided by the frequency of that anomaly (see Figure 1). For example, to be 99% certain that Agent-B has less than 1% difference in its beliefs to Agent-A, 460 probes would be required. This statistical model is hence very pessimistic on the possibility of Agent-A accurately assessing its consensus with Agent-B without using large bandwidths for the probing.

This paper will argue that Figure 1 is a gross over-estimate of the number of probes. Figure 1 assumes no knowledge of the internal structure of our agents. If we commit to some view of structure, then estimates can be generated for the odds of finding differences between Agent-A and Agent-B. Hence, assuming that agents are usually in consensus, we can reduce inter-agent bandwidth as follows:
• Assume consensus and restrict bandwidth.
• As part of the normal inter-agent dialogue, Agent-A occasionally drops a probe question to Agent-B. If the number of probes required is very small, then this process will add little to the overall bandwidth.
• If conflicts are detected, increase the bandwidth between Agent-A and Agent-B to allow for discussions.

The rest of this paper is structured as follows. First, we present a simple mathematical model of the odds of reaching some random node in an and-or graph from some inputs. Secondly, we present simulation results suggesting that the average odds of finding a random node has only one of two behaviours:
• The odds quickly asymptote to a high plateau.
• The odds quickly fall to nearly zero.
The simulation model also offers clear guidelines as to which of these two behaviours will occur. That is, when we design agents, the simulation model can indicate what system features allow agents to quickly assess each other.

Two caveats before continuing. This paper presents a probabilistic model of how agents of a particular form (an and-or graph) can assess each other. Hence:
• It offers average-case results which will be inaccurate in certain circumstances. For mission-critical systems, our average-case analysis is hence inappropriate.
• The psychological likelihood of this model is questionable and we should not use these results to make statements about how humans should co-ordinate.¹
¹ Exception: in the special case where humans are collaborating to test software, these results could be used to optimise their testing efforts.

A Model of Probing
Roughly speaking, probing is a process of finding a needle in a haystack. What are the odds of finding some random needle? To answer this question, we have to commit to some model of a program. The following model applies to any program which can be reduced to a directed and-or graph between concepts. For example, a propositional rule base is such an and-or graph where the primitive concepts are (e.g.) a=true or a=false.

Graphs
We represent a program as a directed-cyclic graph $G$ containing vertices $\{V_i, V_j, \ldots\}$ and edges $E$. $G$ has roots $roots(G)$ and leaves $leaves(G)$. A vertex is one of two types:
• Ors can be believed if any of their parents are believed or if they have been labelled an In vertex (see below). The average number of parents for the ors is

$$parents_{or} = \frac{|parents(ors(G))|}{|ors(G)|} \qquad (2)$$

Each or-node contradicts 0 or more other or-nodes, denoted $no(V_i) = \{V_j, V_k, \ldots\}$. The average size of the no sets for all $V_i$ is denoted $constraints(G)$.
• Ands can be believed if all their parents are believed. The average number of parents for the ands is

$$parents_{and} = \frac{|parents(ands(G))|}{|ands(G)|} \qquad (3)$$

Proofs
The aim of our probing is to generate proofs across $G$ to some output goal $Out_i$. $Out_i$ comes from a test suite. A test suite is the database $\{\langle In_1, Out_1\rangle, \ldots, \langle In_n, Out_n\rangle\}$. We also assume that each $In_i$, $Out_i$ contains only or-vertices. Each output $Out_i$ can generate 0 or more proofs $\{P_x, P_y, \ldots\}$. We make no other comment on the nature of $Out_i$: it may be some undesirable state or some desired goal. $In$ and $Out$ are sets of vertices from $G$. A proof $P \subseteq G$ is a tree containing the vertices $uses(P_x) = \{V_i, V_j, \ldots\}$. The proof tree has:
• Exactly one leaf which is an output; i.e. $|leaves(uses(P_x))| = 1 \wedge V_i \in leaves(uses(P_x)) \wedge V_i \in Out$.
• 1 or more roots $roots(uses(P_x)) \subseteq In$.
• Height $height(P_x)$ being the largest pathway from the leaf to the roots.
When growing a proof, a new vertex is added to the set of vertices already on that proof. The new vertex must not contradict the vertices already in the proof; that is, the new vertex $V_{new}$ must not satisfy:

$$V_{old} \in uses(P_x) \wedge V_{new} \in no(V_{old}) \qquad (4)$$

Odds of Finding $Out_i$
Based on the above, we can compute $OddsR_H$, the odds of reaching some arbitrary output $V_i \in Out$ using a proof tree of height $H$. This section assumes that a new vertex does not conflict with the no sets already in a proof with odds $OddsOK_H$ (this probability is computed later). An output in a tree of height $H = 1$ can only be believed if it is also an input. Only or-vertices can be inputs, so:

$$OddsR^{or}_1 = \frac{|In|}{|V|} \cdot OddsOK^{or}_1 \qquad (5)$$

$$OddsR^{and}_1 = 0 \qquad (6)$$

Otherwise, for $H > 1$:
• If $V_i$ is an and-vertex, then we believe it with the probability of believing all its parents; i.e.

$$OddsR^{and}_H = (OddsR_{H-1})^{parents_{and}} \cdot OddsOK^{and}_H \qquad (7)$$

• If $V_i$ is an or-vertex, then we believe it with the probability of believing any of its parents; i.e.

$$OddsR^{or}_H = parents_{or} \cdot OddsR_{H-1} \cdot OddsOK^{or}_H \qquad (8)$$

From Equations 7 and 8, we make the following predictions:²
² Note that this prediction, and the ones to follow, are not certain inferences since feedback factors may influence our results.

Prediction 1 If and-nodes dominate $G$, then $OddsR$ will decrease very rapidly. If or-nodes dominate $G$, then $OddsR$ will increase very rapidly.

Also, from Equations 5 and 8 it follows that:

Prediction 2 [...]

The probability of believing some arbitrary $V_i$ is some combination of
• The probability that it is an and-vertex and it is believed.
• The probability that it is an or-vertex and it is believed.
For the moment, we will assume that the combining function is the maximum of the two probabilities (and experiment with this later). That is:

$$OddsR_H = maximum\left(OddsR^{and}_H \cdot \frac{|ands(G)|}{|V|},\; OddsR^{or}_H \cdot \frac{|ors(G)|}{|V|}\right) \qquad (9)$$

The maximum value for $OddsR_H$ occurs when $OddsR^{and}_H$ or $OddsR^{or}_H$ is 1; $OddsR_H$ is then bounded by the fraction of and-vertices or or-vertices in $G$. Hence:

Prediction 3 We expect $OddsR_H$ to be asymptotic to some percentage of the ratio of ands/ors in $G$.

Odds of $OddsOK_H$
Let $OddsOK_H$ be the probability of the one world assumption; i.e. the odds that a new vertex at height $H$ can be added into the current proof of height $H - 1$ without contradicting anything else in that proof. At $H = 1$, we are adding a vertex into an empty proof; hence, the chance of a new vertex contradicting existing proof vertices is zero and:

$$OddsOK_1 = 1 \qquad (10)$$

And-vertices have no no set. Hence for all $H$:

$$OddsOK^{and}_H = 1 \qquad (11)$$

For $H > 1$, the odds of an or-vertex not contradicting another or-vertex on the proof, $OddsOK^{or}_H$, is the complement of the sum of the number of or-vertices on that proof times the frequency of constraint violations (Equation 12).

Prediction 4 Unless $constraints(G)$ is a very large fraction of $|V|$, then $OddsOK$ will be very large.

Simulations

Table 1: Simulation parameters
  |V|             = 554..55400
  |In|            = 4
  constraints(G)  = 2..40
  |ands(G)|       = 0.6 * |V|
  |ors(G)|        = |V| - |ands(G)| = 0.4 * |V|
  parents_and     = normal_dist(μ_pands, σ_pands)
  parents_or      = normal_dist(μ_pors, σ_pors)
  μ_pands         = 2
  μ_pors          = z * μ_pands
  σ_pands         = y * μ_pands
  σ_pors          = y * μ_pors
  y               = 0..0.5
  z               = 0.5..2.0
  repeats         = 1..20

[Figure 2: four log-scale plots of $OddsR_H$ (1e-06 to 1) against $H$ (0 to 40), one for each combination of |V| = 554 or 55400 and constraints(G) = 2 or 40, each with curves for z = 0.50, 0.75, 1.00, 1.25, 1.50, 1.75 and 2.00.]
Figure 2: Simulation results at y = 0, repeats = 1.

Table 1 shows parameters for an and-or graph from the real-world model of neuroendocrinology (Smythe 1989) explored previously by Menzies & Compton (Menzies & Compton 1997). Initial experiments with our formulae assumed y = 0, repeats = 1; i.e. the number of parents of and-nodes and or-nodes was constant. The results of the y = 0, repeats = 1 run are shown in Figure 2. Note that for a range of graph sizes and number of constraints, the same pattern emerges:
• Confirming predictions 1 and 2, when or-nodes dominate (above z = 1.25), the odds of reaching some random node asymptote to the percentage of or-nodes in the graph (in these simulations, $|ors(G)|/|V| = 0.4$). When and-nodes dominate (below z = 1.25), the odds of reaching a node drop dramatically.
• Confirming prediction 3, $OddsR_1$ of the top two plots of Figure 2 are 100 times larger than $OddsR_1$ of the bottom two plots. This follows since the ratio $|In|/|V|$ decreased by a factor of 100 when $|V|$ moved from 554 to 55400 and $|In|$ remained constant.
• Confirming prediction 4, an order of magnitude change in $constraints(G)$ had little effect on $OddsR_H$ since ($constraints(G) = 40$) << ($|V| = 554$).
• In the case of the asymptotic effect, after a certain proof depth, the odds of reaching a node are not improved by further searching. In these simulations, that point was H = 10 (for small theories with $|V| = 554$) and H = [...] (for large theories with $|V| = 55400$).

In summary, nodes are either very reachable or barely reachable, depending on the average number of and-node parents and the average number of or-node parents. Hence, if we design our agents with more or-parents than and-parents, these results suggest that the agents will be able to check for consensus very easily.
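The two behaviours described in this section can be explored by iterating the recurrences of Equations 5 to 9 numerically. The Python sketch below is a minimal illustration under stated assumptions, not the authors' simulator: it takes OddsOK = 1 throughout (in the spirit of Prediction 4), caps the or-vertex odds at 1, and combines the and/or odds by the maximum, each weighted by the fraction of and-vertices or or-vertices in the graph. The function name `odds_reach` and its parameter names are hypothetical; the defaults follow Table 1.

```python
def odds_reach(height, z, n_inputs=4, n_vertices=554,
               frac_and=0.6, mean_and_parents=2.0, odds_ok=1.0):
    """Iterate OddsR_H up to the given proof height (assumption-laden sketch)."""
    frac_or = 1.0 - frac_and
    mean_or_parents = z * mean_and_parents      # Table 1: mu_pors = z * mu_pands

    # H = 1: an output is believed only if it is also an input (or-vertices only).
    odds_or = (n_inputs / n_vertices) * odds_ok
    odds_and = 0.0
    odds = max(frac_and * odds_and, frac_or * odds_or)

    for _ in range(2, height + 1):
        # And-vertex: all parents must be believed (OddsOK for ands is 1).
        odds_and = odds ** mean_and_parents
        # Or-vertex: any parent suffices; capped at 1 (assumption).
        odds_or = min(1.0, mean_or_parents * odds * odds_ok)
        odds = max(frac_and * odds_and, frac_or * odds_or)
    return odds


if __name__ == "__main__":
    for z in (0.5, 1.0, 1.25, 1.5, 2.0):
        print(f"z={z:4.2f}  OddsR_40={odds_reach(40, z):.3e}")
```

Under these assumptions the or-branch grows by a factor of 0.4 * 2z per level, so the switch between decay and growth falls at z = 1.25; at z = 2.0 the value settles at the or-vertex fraction 0.4, and at z = 0.5 it decays towards zero.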
Figure 3 is an expansion of the far left-hand-side of Figure 1 and shows that (e.g.) 4 probes give an 87% confidence of finding a node when $OddsR_H = 0.4$. 4 probes is a very small overhead to an on-going conversation. On the other hand, if we make the internal logic of our agents very complex (more and-nodes than or-nodes) then it will be very hard to monitor consensus without requiring many probes and a very large bandwidth.

[Figure 3: plot of confidence C against number of tests (0 to 20), one curve per value of $OddsR_H$.]
Figure 3: $C = 1 - (1 - OddsR_H)^N$; $0.2 \le OddsR_H < 0.8$.

External Validity
This section explores the generalizability of the above results. There are at least three challenges to the above results. Firstly, they assume a simple model of the graph being processed; i.e. uniform distributions of and-parents and or-parents. How sensitive are the above findings to changes in the distributions? To explore this issue, the runs of Figure 2 were repeated several times with increasing variance in the distributions on node parents. In terms of Table 1, this meant runs with repeats = 20, y = 0..0.5.
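A run of this kind might be set up roughly as in the Python sketch below, which draws the parent counts from normal distributions with standard deviation y times the mean (as in Table 1) and reports the standard error of the mean of OddsR_H as a percentage of that mean. This is only an illustration under several assumptions that are not in the paper: OddsOK is taken as 1, or-vertex odds are capped at 1, sampled and-parent counts are clamped to at least 1 and or-parent counts to at least 0, and the names `odds_reach` and `percent_standard_error` are hypothetical.

```python
import random
import statistics


def odds_reach(height, and_parents, or_parents,
               n_inputs=4, n_vertices=554, frac_and=0.6):
    """OddsR_H from the paper's recurrences, taking OddsOK = 1 (assumption)."""
    frac_or = 1.0 - frac_and
    odds = frac_or * n_inputs / n_vertices       # H = 1: inputs are or-vertices
    for _ in range(2, height + 1):
        odds_and = odds ** and_parents           # believe all parents
        odds_or = min(1.0, or_parents * odds)    # believe any parent, capped at 1
        odds = max(frac_and * odds_and, frac_or * odds_or)
    return odds


def percent_standard_error(height, z, y, repeats=20, mu_and=2.0, seed=0):
    """Standard error of the mean of OddsR_H, as a percentage of the mean."""
    rng = random.Random(seed)
    mu_or = z * mu_and
    samples = []
    for _ in range(repeats):
        # sigma = y * mu as in Table 1; the clamps are an assumption.
        and_parents = max(1.0, rng.gauss(mu_and, y * mu_and))
        or_parents = max(0.0, rng.gauss(mu_or, y * mu_or))
        samples.append(odds_reach(height, and_parents, or_parents))
    mean = statistics.mean(samples)
    if mean == 0.0:
        return 0.0
    std_err = statistics.stdev(samples) / repeats ** 0.5
    return 100.0 * std_err / mean


if __name__ == "__main__":
    for y in (0.0, 0.1, 0.3, 0.5):
        print(f"y={y:.1f}  stderr={percent_standard_error(25, 1.55, y):.1f}%")
```

With y = 0 every repeat uses the same parent counts, so the reported standard error is zero; larger y values spread the sampled graphs across the growth and decay regimes.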
The results of these runs are shown as:
• The standard error on the mean of $OddsR_H$.
• Expressed as a percentage of the mean (e.g. a 50% standard error means that our expectation of X varies from 0.5X to 1.5X).

[Figure 4: three plots, for z = 0.55, z = 1.05 and z = 1.55, with vertical axis from 0 to 100 against y (0 to 0.5), each with one curve per height bound H<25, H<50, H<75, H<100, H<125 and H<150.]
Figure 4: Simulation results at y = 0..0.5, repeats = 20.

Large deviations were only noted for H > 50 (see the curves marked H < 150, H < 125, H < 100, H < 75 in Figure 4). Returning to Figure 2, note that by H > 50, the odds have either risen to their plateau or fallen away to be vanishingly small. That is, increasing the variances (by changing y) only affects our results in uninteresting regions.

Secondly, the above runs used Equation 9, which made the optimistic assumption that

$$OddsR_H = maximum(OddsR^{and}_H, OddsR^{or}_H)$$

What happens if we reason pessimistically, i.e.

$$OddsR_H = minimum(OddsR^{and}_H, OddsR^{or}_H)$$

To study this, a parameter $\lambda$ was introduced:
• $\lambda = 0$ implies Equation 9 uses minimums;
• $\lambda = 10$ implies Equation 9 uses maximums;
• $1 < \lambda < 9$ moves Equation 9 on a linear sliding scale between minimum and maximum.

The run of Figure 2 was repeated for $|V| = 554$ and $constraints(G) = 2$. The results (see Figure 5) show that minimisation, or even averaging ($\lambda = 5$), is inappropriate. Only when we nearly maximise ($\lambda > 7$) do we get any behaviour at all (i.e. maximum $OddsR_H$ rises above zero).

[Figure 5: plot against lambda (0 to 10), with vertical axis from -0.1 to 0.7.]
Figure 5: Combining via minimisation (left) to combining via maximisation (right).

Thirdly, we could refute Figure 2 by showing that it does not accurately reflect the behaviour of known search engines. Two predictions can be generated from Figure 2:
• As z is altered, the odds of reaching solutions switch suddenly from very low to very high (this was observed at the z = 1.25 point).
• Recalling the discussion around Figure 3, Figure 2 is saying that a small number of probes should reach as many nodes as a large number of probes.

Both these behaviours have been noted in the literature:
• Sudden phase transitions in solvability: In many NP-hard problems, it has been noted that there exist very narrow regions around which problems switch from being easily solvable to easily unsolvable (Cheeseman, Kanefsky, & Taylor 1991).
• A small number of probes suffices: HT4 is a multi-world reasoner that searches for worlds that contain the most known behaviour of a system. HT4 explores all assumptions that could potentially lead to desired goals. HT0 is a cut-down version of HT4 which, a small number of times, finds one random world. In millions of runs over tens of thousands of theories, HT0 reached 98% of the desired goals that HT4 reached (Menzies 1999). Dozens of analogous results showing that a small number of probes suffice have been reported in the software engineering and knowledge engineering literature (Menzies & Cukic 1999).

Discussion
If we make some assumptions about the internal structure of an agent, we can generate estimates of how many probes are required to test if that structure conforms to an external specification. In the case of and-or graphs, the total number of probes is either very large or very small depending on certain design choices (average number of and-node parents and or-node parents).

A partial list follows of language features that generate theory dependency graphs with many or-nodes:
• Disjunctions.
• Any indeterminacy, such as a conditional using a randomly generated number and a threshold comparison.
• Polymorphism: one "or" would be generated for each type in the system that is accessed by this polymorphic operator.
• Calls to methods implemented and over-ridden many times in a hierarchy: one "or" would be generated for each possible receiver of the message.

A clear research direction from this work is the further definition of language features that reduce the time required to check consensus amongst agents.

Acknowledgements
This work was partially supported by NASA through cooperative agreement #NCC2-979. Discussions with Goran Trajkovski prompted the $\lambda$ study.

References
Cheeseman, P.; Kanefsky, B.; and Taylor, W. 1991. Where the really hard problems are. In Proceedings of IJCAI-91, 331-337.
Coiera, E. 1996. Communication in organisations. Technical report, Hewlett-Packard Laboratories. Technical Report HPL-96-65, May, 1996.
Coiera, E. 1999. Communication under scarcity of resources. Technical report, Hewlett-Packard Laboratories. Technical Report, 1999, to appear.
Hamlet, D., and Taylor, R. 1990. Partition testing does not inspire confidence. IEEE Transactions on Software Engineering 16(12):1402-1411.
J Clarke, S.B. 1991. Grounding in communication. In Perspectives on Socially Shared Cognition. American Psychological Association.
Lowrey, M.; Boyd, M.; and Kulkarni, D. 1998. Towards a theory for integration of mathematical verification and empirical testing. In Proceedings, ASE'98: Automated Software Engineering, 322-331.
McCarthy, J. C., and Monk, A. F. 1994. Channels, conversation, co-operation and relevance: all you wanted to know about communication but were afraid to ask. Collaborative Computing 1:35-60.
Menzies, T., and Compton, P. 1997. Applications of abduction: Hypothesis testing of neuroendocrinological qualitative compartmental models. Artificial Intelligence in Medicine 10:145-175. Available from http://www.cse.unsw.edu.au/~timm/pub/docs/96aim.ps.gz.
Menzies, T., and Cukic, B. 1999. Intelligent testing can be very lazy. NASA/WVU IV&V tech report. Submitted to ISE'99.
Menzies, T. 1999. Simpler, faster abductive validation. AAAI-99 (submitted).
Smythe, G. 1989. Brain-hypothalmus, Pituitary and the Endocrine Pancreas. The Endocrine Pancreas,