Smaller, Faster Agent Dialogues via Conversational Probing

From: AAAI Technical Report WS-99-12. Compilation copyright © 1999, AAAI (www.aaai.org). All rights reserved.
Tim Menzies†, Bojan Cukic¶, Enrico Coiera‡
†NASA/WVU Software Research Lab, 100 University Drive, Fairmont, USA
¶Department of Computing Science and Electrical Engineering, West Virginia University, Morgantown, WV
‡Department of Computer Science and Engineering, University of New South Wales, Australia 2052
<tim@menzies.com, cukic@csee.wvu.edu, ewc@pobox.com>

Abstract

Studies of human conversation suggest that agents whose world models are in consensus can work well together using only very narrow bandwidths. The total bandwidth required between agents could hence be minimised if we could recognise when model consensus breaks down. At the breakdown point, the communication policy could switch from some usual-case low value to a temporary high value while the model conflict is resolved. To effectively recognise the breakdown point, we need tools that recognise model conflicts without requiring extensive bandwidth. A mathematical model of probing and-or graphs suggests that, for a large range of interesting models, the number of probes required to detect consensus breakdown is quite low.

Keywords: Agent communication, Negotiation protocols, Negotiating to maintain coalitions, Facilitating the negotiation process

Agents running on distributed computers may interact under a variety of resource restrictions. One restriction is bandwidth. Every transmission also takes time since it must be processed by both the sender and the receiver. Studies of human communication suggest that failure to optimise the use of available resources can result in inefficient communication within organisations and consequent task errors (Coiera 1996). How can we minimise the use of those resources for agents, and maximise the likelihood that communication tasks are successfully completed?

Consider two humans communicating via a channel with a certain bandwidth. A robust finding from human conversations is that restricting bandwidth has little impact on the effectiveness of the outcomes of many collaborative problem solving tasks (McCarthy & Monk 1994). This occurs when the communicating individuals have a high degree of shared common ground. A particularly striking finding is that humans assume they share common world views in conversation until they detect an error, and then attempt to probe the cause of the error. Interestingly, human conversation seems to consist of both an explicit small set of exchanges devoted to testing or confirming understanding, as well as the primary exchange (J Clarke 1991).

Can we use these observations to minimise the bandwidth required between computational agents? Suppose agents move through alternating stages of assumed model consensus, detection of model conflict, and then model repair. While agent models are in conflict, we broaden the bandwidth between them. When models are in consensus, we restrict the bandwidth by assuming shared knowledge between agents. As a result, bandwidth is conserved during conversation until it is explicitly demanded to reconcile the world models of conversing agents. Time is also conserved by minimising unnecessary sharing of knowledge between agents (Coiera 1999).

This "assume-consensus" assumption has another advantage. Consider agents with a learning component. Such agents may update their world model. If they wish to co-ordinate with other agents, then they would need to reflect on the beliefs of their fellows. In the worst case, this means that each agent must maintain one belief set for every other agent in its community. This belief set should include what the other agent thinks about all the other agents. If that second-level belief set includes the original agent's beliefs, an infinite regress may occur (e.g. "I think he thinks I think that he thinks that I think that."). The "assume-consensus" model addresses the infinite regress problem. Agents in assumed consensus only need to store their own beliefs and one extra axiom; i.e. "if I believe that she believes what I do, then my beliefs equal her beliefs".

To apply this approach, agents have to continually test that other agents hold their beliefs. One method for doing this would be to ask agents to dump their belief sets to each other. We consider this approach impractical for three reasons:

• It incurs the penalty of the infinite regress, discussed above.
• Such belief-dumps could exhaust the available bandwidth.
• Agents may not wish to give others full access to their internal beliefs. For example, security issues may block an agent from one vendor accessing the beliefs of an agent from another vendor.

Without access to internal structures, how can one agent assess the contents of another? Software engineering has one answer to this question: black-box specification-based probing (Hamlet & Taylor 1990; Lowrey, Boyd, & Kulkarni 1998). Agent-A could log its own behaviour to generate a library of input-output pairs. In doing so, Agent-A is using itself as a specification of the expected behaviour of Agent-B, assuming our two agents are in consensus. If Agent-A
sees those outputs when Agent-B is presented with those inputs, then Agent-A could infer that Agent-B has the same beliefs as itself.

[Figure 1 plot: horizontal axis "# of tests" (log scale, 10 to 1e+08); vertical axis ticks 0.1 to 0.9; one curve per failure rate: 1 failure in 100; 1,000; 10,000; 100,000; 1,000,000. The value 4603 is marked on the plot.]

Figure 1: C = 1 - (1 - a)^N. Theoretically, 4603 probes are required to achieve a 99% chance of detecting moderately infrequent items; i.e. ones that only occupy 10^-3 of the model (Hamlet & Taylor 1990).
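
The black-box check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an agent is modelled as a plain function from inputs to outputs, and the names `log_behaviour`, `probe`, `agent_a` and `agent_b` are hypothetical, not taken from the paper.

```python
import random
from typing import Callable, Hashable, List, Tuple

# Assumption of this sketch: an agent is just a function from input to output.
Agent = Callable[[Hashable], Hashable]
Library = List[Tuple[Hashable, Hashable]]


def log_behaviour(agent_a: Agent, inputs: List[Hashable]) -> Library:
    """Agent-A records its own input-output pairs to use as a specification."""
    return [(x, agent_a(x)) for x in inputs]


def probe(agent_b: Agent, library: Library, n_probes: int,
          rng: random.Random) -> bool:
    """Replay n_probes logged inputs to Agent-B; True if every output matches."""
    sample = rng.sample(library, min(n_probes, len(library)))
    return all(agent_b(x) == expected for x, expected in sample)


# Two toy agents that can only disagree on multiples of 10.
def agent_a(x: int) -> int:
    return x % 3


def agent_b(x: int) -> int:
    return 0 if x % 10 == 0 else x % 3


library = log_behaviour(agent_a, list(range(100)))
```

Replaying the whole library exposes the disagreement, while a handful of probes will usually miss it, which is the bandwidth problem the text goes on to quantify.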

Unfortunately, black-box probes can be very bandwidth expensive. Equation 1 shows a widely-used statistical model connecting the number of probes N to the confidence C that we will find an error with frequency of occurrence a.

$$C = 1 - (1 - a)^N \qquad (1)$$
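
Equation 1 can be inverted to give the number of probes needed for a target confidence. The sketch below is illustrative; the function names `confidence` and `probes_needed` are not from the paper.

```python
import math


def confidence(a: float, n: int) -> float:
    """Equation 1: chance that n probes find an error of frequency a."""
    return 1.0 - (1.0 - a) ** n


def probes_needed(a: float, c: float) -> int:
    """Smallest n with confidence(a, n) >= c, obtained by inverting Equation 1."""
    return math.ceil(math.log(1.0 - c) / math.log(1.0 - a))


# An error occupying one thousandth of the model, sought with 99% confidence.
n = probes_needed(0.001, 0.99)
```

With a = 0.001 and C = 0.99 this gives 4603 probes, the value quoted in the caption of Figure 1. For small a, ln(1 - a) is close to -a, so at 99% confidence the count is roughly 4.6 / a.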

Equation 1 says that for probes to be 99% certain of detecting an anomaly, the number of black-box probes required is approximately 4.6 divided by the frequency of that anomaly (see Figure 1). For example, to be 99% certain that Agent-B has less than 1% difference in its beliefs to Agent-A, 460 probes would be required. This statistical model is hence very pessimistic about the possibility of Agent-A accurately assessing its consensus with Agent-B without using large bandwidths for the probing.

This paper will argue that Figure 1 is a gross over-estimate of the number of probes. Figure 1 assumes no knowledge of the internal structure of our agents. If we commit to some view of structure, then estimates can be generated for the odds of finding differences between Agent-A and Agent-B.

Hence, assuming that agents are usually in consensus, we can reduce inter-agent bandwidth as follows:

• Assume consensus and restrict bandwidth.
• As part of the normal inter-agent dialogue, Agent-A occasionally drops a probe question to Agent-B. If the number of probes required is very small, then this process will add little to the overall bandwidth.
• If conflicts are detected, increase the bandwidth between Agent-A and Agent-B to allow for discussions.

The rest of this paper is structured as follows. First, we present a simple mathematical model of the odds of reaching some random node in an and-or graph from some inputs. Secondly, we present simulation results suggesting that the average odds of finding a random node have only one of two behaviours:

• The odds quickly asymptote to a high plateau.
• The odds quickly fall to nearly zero.

The simulation model also offers clear guidelines as to which of these two behaviours will occur. That is, when we design agents, the simulation model can indicate what system features allow agents to quickly assess each other.

Two caveats before continuing. This paper presents a probabilistic model of how agents of a particular form (an and-or graph) can assess each other. Hence:

• It offers average case results which will be inaccurate in certain circumstances. For mission-critical systems, our average-case analysis is hence inappropriate.
• The psychological likelihood of this model is questionable and we should not use these results to make statements about how humans should co-ordinate.¹

¹ Exception: in the special case where humans are collaborating to test software, these results could be used to optimise their testing efforts.

A Model of Probing

Roughly speaking, probing is a process of finding a needle in a haystack. What are the odds of finding some random needle? To answer this question, we have to commit to some model of a program. The following model applies to any program which can be reduced to a directed and-or graph between concepts. For example, a propositional rule base is such an and-or graph where the primitive concepts are (e.g.) a=true or a=false.

Graphs

We represent a program as a directed-cyclic graph G containing vertices {V_i, V_j, ...} and edges E. G has roots roots(G) and leaves leaves(G). A vertex is one of two types:

• Ors can be believed if any of their parents are believed or if they have been labeled an In vertex (see below). The average number of parents for the ors is

$$\mathit{parents}_{or} = \frac{|\mathit{parents}(\mathit{ors}(G))|}{|\mathit{ors}(G)|} \qquad (2)$$

Each or-node contradicts 0 or more other or-nodes, denoted no(V_i) = {V_j, V_k, ...}. The average size of the no sets for all V_i is denoted constraints(G).

• Ands can be believed if all their parents are believed. The average number of parents for the ands is

$$\mathit{parents}_{and} = \frac{|\mathit{parents}(\mathit{ands}(G))|}{|\mathit{ands}(G)|} \qquad (3)$$
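
The graph statistics above can be computed directly from a small in-memory graph. This sketch makes two assumptions that the text does not spell out: the numerators of Equations 2 and 3 are read as the total number of parent links, and constraints(G) is averaged over or-vertices only (and-vertices have no no set). The `Vertex` class and the five-vertex graph are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Vertex:
    kind: str                                            # "and" or "or"
    parents: List[str] = field(default_factory=list)
    no: Set[str] = field(default_factory=set)            # contradicted or-vertices


def mean_parents(g: Dict[str, Vertex], kind: str) -> float:
    """Equations 2 and 3: parent links of one vertex type over their count."""
    vs = [v for v in g.values() if v.kind == kind]
    return sum(len(v.parents) for v in vs) / len(vs)


def constraints(g: Dict[str, Vertex]) -> float:
    """Average size of the no sets, taken over or-vertices (an assumption)."""
    ors = [v for v in g.values() if v.kind == "or"]
    return sum(len(v.no) for v in ors) / len(ors)


# Toy graph: a and b contradict each other; c needs both a and d; e needs c or d.
g = {
    "a": Vertex("or", no={"b"}),
    "b": Vertex("or", no={"a"}),
    "d": Vertex("or"),
    "c": Vertex("and", parents=["a", "d"]),
    "e": Vertex("or", parents=["c", "d"]),
}
```

For this toy graph the or-vertices average 0.5 parents, the single and-vertex has 2, and constraints(G) is 0.5.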

Proofs

The aim of our probing is to generate proofs across G to some output goal Out_i. Out_i comes from a test suite. A test suite is the database {⟨In_1, Out_1⟩, ..., ⟨In_n, Out_n⟩}. We also assume that each In_i, Out_i contains only or-vertices. Each output Out_i can generate 0 or more proofs {P_x, P_y, ...}. We make no other comment on the nature of Out_i: it may be some undesirable state or some desired goal. In and Out are sets of vertices from G. A proof P ⊆ G is a tree containing the vertices uses(P_x) = {V_i, V_j, ...}. The proof tree has:

• Exactly one leaf which is an output; i.e.

$$|\mathit{leaves}(\mathit{uses}(P_x))| = 1 \wedge V_i \in \mathit{leaves}(\mathit{uses}(P_x)) \wedge V_i \in \mathit{Out} \qquad (4)$$

• 1 or more roots, with roots(uses(P_x)) ⊆ In.
• Height height(P_x), being the longest pathway from the leaf to the roots.

When growing a proof, a new vertex is added to the set of vertices already on that proof. The new vertex must not contradict the vertices already in the proof; that is, the new vertex V_new must not satisfy:

$$V_{old} \in \mathit{uses}(P_x) \wedge V_{new} \in \mathit{no}(V_{old}) \qquad (5)$$

Odds of Finding Out_i

Based on the above, we can compute OddsR_H, the odds of reaching some arbitrary output V_i ∈ Out using a proof tree of height H. This section assumes that a new vertex does not conflict with the no sets already in a proof with odds OddsOK_H (this probability is computed later).

An output in a tree of height H = 1 can only be believed if it is also an input. Only or-vertices can be inputs, so:

$$\mathit{OddsR}^{or}_1 = \frac{|\mathit{In}|}{|V|} * \mathit{OddsOK}^{or}_1, \qquad \mathit{OddsR}^{and}_1 = 0 \qquad (6)$$

Otherwise, for H > 1:

• If V_i is an and-vertex, then we believe it with the probability of believing all its parents; i.e.

$$\mathit{OddsR}^{and}_H = (\mathit{OddsR}_{H-1})^{\mathit{parents}_{and}} * \mathit{OddsOK}^{and}_H \qquad (7)$$

• If V_i is an or-vertex, then we believe it with the probability of believing any of its parents; i.e.

$$\mathit{OddsR}^{or}_H = \mathit{parents}_{or} * \mathit{OddsR}_{H-1} * \mathit{OddsOK}^{or}_H \qquad (8)$$

From Equations 7 and 8, we make the following predictions:

Prediction 1 If and-nodes dominate G, then OddsR will decrease very rapidly. If or-nodes dominate G, then OddsR will increase very rapidly.²

² Note that this prediction, and the ones to follow, are not certain inferences since feedback factors may influence our results.

Also, from Equation 5 and 8 it follows that:

Prediction 2 OddsR_H ∝ |In| / |V|

The probability of believing some arbitrary V_i is some combination of

• The probability that it is an and-vertex and it is believed.
• The probability that it is an or-vertex and it is believed.

For the moment, we will assume that the combining function is the maximum of the two probabilities (and experiment with this later). That is:

$$\mathit{OddsR}_H = \mathit{maximum}(\mathit{OddsR}^{and}_H, \mathit{OddsR}^{or}_H) \qquad (9)$$

The maximum value for OddsR_H is when OddsR^and_H or OddsR^or_H is 1, i.e.

[Equation illegible in the source; the surviving fragment mentions |ands(G)|.]

Hence:

Prediction 3 We expect OddsR_H to be asymptotic to some percentage of the ratio of ands/ors in G.

Odds of OddsOK_H

Let OddsOK_H be the probability of the one world assumption; i.e. the odds that a new vertex at height H can be added into the current proof of height H - 1 without contradicting anything else in that proof.

At H = 1, we are adding a vertex into an empty proof; hence, the chance of a new vertex contradicting existing proof vertices is zero and:

$$\mathit{OddsOK}_1 = 1 \qquad (10)$$

And-vertices have no no set. Hence for all H:

$$\mathit{OddsOK}^{and}_H = 1 \qquad (11)$$

For H > 1, the odds of an or-vertex not contradicting another or-vertex on the proof is the complement of the sum of the number of or-vertices on that proof times the frequency of constraint violations:

[Equation 12: formula illegible in the source.]

Prediction 4 Unless constraints(G) is a very large fraction of |V|, then OddsOK will be very large.
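
The recurrence in Equations 6 to 9 can be iterated numerically. The sketch below is a rough illustration, not the simulator used in this paper, and it rests on three assumptions: OddsOK for or-vertices is held at a caller-supplied constant `ok_or` rather than computed from Equation 12; OddsOK for and-vertices is 1 (Equation 11); and the or-branch of Equation 8 is clamped to 1 so that it stays a probability. It is therefore not expected to reproduce the plateau values reported in the simulations.

```python
from typing import List


def odds_r(height: int, n_in: int, n_v: int,
           parents_and: float, parents_or: float,
           ok_or: float = 1.0) -> List[float]:
    """Return [OddsR_1, ..., OddsR_height] by iterating Equations 6 to 9.

    ok_or is a constant stand-in for OddsOK^or_H (an assumption of this
    sketch); OddsOK^and_H is taken as 1 for every H.
    """
    r_or = (n_in / n_v) * ok_or       # Equation 6: only or-vertices can be inputs
    r_and = 0.0                       # Equation 6: and-vertices are never inputs
    odds = [max(r_and, r_or)]         # Equation 9
    for _ in range(2, height + 1):
        prev = odds[-1]
        r_and = prev ** parents_and                    # Equation 7
        r_or = min(1.0, parents_or * prev * ok_or)     # Equation 8, clamped
        odds.append(max(r_and, r_or))                  # Equation 9
    return odds


growing = odds_r(20, n_in=4, n_v=554, parents_and=2, parents_or=3)
shrinking = odds_r(20, n_in=4, n_v=554, parents_and=2, parents_or=0.5)
```

Under these assumptions, more than one or-parent on average makes the odds climb until the clamp is reached, while fewer than one makes them decay towards zero, which is the qualitative split described in Prediction 1.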

|V|            = 554..55400
|In|           = 4
constraints(G) = 2..40
|ands(G)|      = 0.6 * |V|
|ors(G)|       = |V| - |ands(G)| = 0.4 * |V|
parents_and    = normal_dist(μ_pands, σ_pands)
parents_or     = normal_dist(μ_pors, σ_pors)
μ_pands        = 2
μ_pors         = z * μ_pands
σ_pands        = y * μ_pands
σ_pors         = y * μ_pors
y              = 0..0.5
z              = 0.5..2.0
repeats        = 1..20

Table 1: Simulation parameters

Simulations

Table 1 shows parameters for an and-or graph from the real-world model of neuroendocrinology (Smythe 1989) explored previously by Menzies & Compton (Menzies & Compton 1997). Initial experiments with our formulae assumed y = 0, repeats = 1; i.e. the number of parents of and-nodes and or-nodes was constant. The results of the y = 0, repeats = 1 run are shown in Figure 2. Note that for a range of graph sizes and number of constraints, the same pattern emerges:

• Confirming predictions 1 and 3, when or-nodes dominate (above z = 1.25), the odds of reaching some random node asymptote to the percentage of or-nodes in the graph (in these simulations, |ors(G)| / |V| = 0.4). When and-nodes dominate (below z = 1.25), the odds of reaching a node drop dramatically.
• Confirming prediction 2, OddsR_1 of the top two plots of Figure 2 are 100 times larger than OddsR_1 of the bottom two plots. This follows since the ratio |In| / |V| decreased by a factor of 100 when |V| moved from 554 to 55400 and |In| remained constant.
• Confirming prediction 4, an order of magnitude change in constraints(G) had little effect on OddsR_H since (constraints(G) = 40) << (|V| = 554).
• In the case of the asymptotic effect, after a certain proof depth, the odds of reaching a node are not improved by further searching. In these simulations, that point was H = 10 (for small theories with |V| = 554) and H = (for large theories with |V| = 55400).

In summary, nodes are either very reachable or barely reachable, depending on the average number of and-node parents and the average number of or-node parents. Hence, if we design our agents with more or-parents than and-parents, these results suggest that the agents will be able to check for consensus very easily. Figure 3 is an expansion of the far left-hand-side of Figure 1 and shows that (e.g.) 4 probes give an 87% confidence of finding a node when OddsR_H = 0.4. 4 probes is a very small overhead to an
on-going conversation. On the other hand, if we make the internal logic of our agents very complex (more and-nodes than or-nodes) then it will be very hard to monitor consensus without requiring many probes and a very large bandwidth.

[Figure 2 plot: four panels titled |V|=554, constraints(G)=2; |V|=554, constraints(G)=40; |V|=55400, constraints(G)=2; |V|=55400, constraints(G)=40. Horizontal axis H (1 to 40); vertical axis on a log scale from 1e-06 to 1; one curve per z = 0.50, 0.75, 1.00, 1.25, 1.50, 1.75, 2.00.]

Figure 2: Simulation results at y = 0, repeats = 1.

[Figure 3 plot: horizontal axis "Number of tests" (0 to 20); vertical axis ticks up to 0.9; one curve per value of OddsR(H).]

Figure 3: C = 1 - (1 - OddsR_H)^N; 0.2 ≤ OddsR_H < 0.8.
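
The 87% confidence quoted for 4 probes follows from substituting OddsR_H = 0.4 and N = 4 into the Figure 3 formula:

$$C = 1 - (1 - 0.4)^4 = 1 - 0.6^4 = 1 - 0.1296 = 0.8704 \approx 87\%$$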

External Validity

This section explores the generalisability of the above results.

There are at least three challenges to the above results. Firstly, they assume a simple model of the graph being processed; i.e. uniform distributions of and-parents and or-parents. How sensitive are the above findings to changes in the distributions? To explore this issue, the runs of Figure 2 were repeated several times with increasing variance in the distributions on node parents. In terms of Table 1, this meant runs with repeats = 20, y = 0..0.5. The results of these runs are shown as:

• The standard error on the mean of OddsR...
• Expressed as a percentage of the mean (e.g. a 50% standard error means that our expectation of X varies from 0.5X to 1.5X).

[Figure 4 plot: three panels titled z=0.55, z=1.05 and z=1.55. Horizontal axis y (0.1 to 0.5); vertical axis 0 to 100; one curve per proof-height band H<25, H<50, H<75, H<100, H<125, H<150.]

Figure 4: Simulation results at y = 0..0.5, repeats = 20.
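
The measure in the two bullets above can be sketched as follows. The function name `standard_error_percent` and the sample values are hypothetical, and the sketch assumes the usual sample standard deviation (n - 1 denominator), which the paper does not specify.

```python
import statistics
from typing import Sequence


def standard_error_percent(samples: Sequence[float]) -> float:
    """Standard error of the mean, expressed as a percentage of the mean."""
    mean = statistics.fmean(samples)
    # Sample standard deviation divided by sqrt(n) gives the standard error.
    sem = statistics.stdev(samples) / len(samples) ** 0.5
    return 100.0 * sem / mean


# Four hypothetical repeats of one OddsR measurement.
pct = standard_error_percent([0.2, 0.4, 0.6, 0.4])
```

For these four values the result is about 20.4%, meaning the expectation of the mean varies by roughly a fifth in either direction.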

Large deviations were only noted for H > 50 (see the curves marked H < 150, H < 125, H < 100, H < 75 in Figure 4). Returning to Figure 2, note that by H > 50, the odds have either risen to their plateau or fallen away to be vanishingly small. That is, increasing the variances (by changing y) only affects our results in uninteresting regions.

Secondly, the above runs used Equation 9, which made the optimistic assumption that

$$\mathit{OddsR}_H = \mathit{maximum}(\mathit{OddsR}^{and}_H, \mathit{OddsR}^{or}_H)$$

What happens if we reason pessimistically, i.e.

$$\mathit{OddsR}_H = \mathit{minimum}(\mathit{OddsR}^{and}_H, \mathit{OddsR}^{or}_H)$$

To study this, a parameter λ was introduced:

• λ = 0 implies Equation 9 uses minimums;
• λ = 10 implies Equation 9 uses maximums;
• 1 < λ < 9 moves Equation 9 on a linear sliding scale between minimum and maximum.

[Figure 5 plot: horizontal axis lambda (0 to 10); vertical axis from -0.1 to 0.8.]

Figure 5: Combining via minimization (left) to combining via maximization (right).
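
One way to read the λ sliding scale is as a linear interpolation between the smaller and the larger of the two probabilities. The sketch below is an assumption about how such a scale could be implemented; the function name `combine` and the weight λ/10 are not taken from the paper.

```python
def combine(odds_and: float, odds_or: float, lam: float) -> float:
    """Blend the two odds: lam = 0 gives the minimum, lam = 10 the maximum."""
    if not 0.0 <= lam <= 10.0:
        raise ValueError("lam must lie between 0 and 10")
    lo, hi = sorted((odds_and, odds_or))
    # Linear sliding scale between minimum and maximum (assumed weight lam/10).
    return lo + (lam / 10.0) * (hi - lo)


halfway = combine(0.1, 0.5, 5)
```

With this reading, λ = 5 returns the average of the two odds, which matches the text's description of λ = 5 as averaging.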

The run of Figure 2 was repeated for |V| = 554 and constraints(G) = 2. The results (see Figure 5) show that minimization, or even averaging (λ = 5), is inappropriate. Only when we nearly maximize (λ > 7) do we get any behaviour at all (i.e. maximum OddsR_H rises above zero).

Thirdly, we could refute Figure 2 by showing that it does not accurately reflect the behaviour of known search engines. Two predictions can be generated from Figure 2:

• As z is altered, the odds of reaching solutions switch suddenly from very low to very high (this was observed at the z = 1.25 point).
• Recalling the discussion around Figure 3, Figure 2 is saying that a small number of probes should reach as many nodes as a large number of probes.

Both these behaviours have been noted in the literature:

• Sudden phase transitions in solvability: In many NP-hard problems, it has been noted that there exist very narrow regions around which problems switch from being easily solvable to easily unsolvable (Cheeseman, Kanefsky, & Taylor 1991).
• A small number of probes suffices: HT4 is a multi-world reasoner that searches for worlds that contain the most known behaviour of a system. HT4 explores all assumptions that could potentially lead to desired goals. HT0 is a cut-down version of HT4 which, a small number of times, finds one random world. In millions of runs over tens of thousands of theories, HT0 reached 98% of the desired goals as HT4 (Menzies 1999). Dozens of analogous results showing that a small number of probes suffice have been reported in the software engineering and knowledge engineering literature (Menzies & Cukic 1999).

Discussion

If we make some assumptions about the internal structure of an agent, we can generate estimates of how many probes are required to test if that structure conforms to an external specification. In the case of and-or graphs, the total number of probes is either very large or very small depending on certain design choices (average number of and-node parents and or-node parents).

A partial list follows of language features that generate theory dependency graphs with many or-nodes:

• Disjunctions.
• Any indeterminacy, such as a conditional using a randomly generated number and a threshold comparison.
• Polymorphism: one "or" would be generated for each type in the system that is accessed by this polymorphic operator.
• Calls to methods implemented and over-ridden many times in a hierarchy: one "or" would be generated for each possible receiver of the message.

A clear research direction from this work is the further definition of language features that reduce the time required to check consensus amongst agents.

Acknowledgements

This work was partially supported by NASA through cooperative agreement #NCC2-979. Discussions with Goran Trajkovski prompted the λ study.

References

Cheeseman, P.; Kanefsky, B.; and Taylor, W. 1991. Where the really hard problems are. In Proceedings of IJCAI-91, 331-337.
Coiera, E. 1996. Communication in organisations. Technical report, Hewlett-Packard Laboratories. Technical Report HPL-9665, May, 1996.
Coiera, E. 1999. Communication under scarcity of resources. Technical report, Hewlett-Packard Laboratories. Technical Report, 1999, to appear.
Hamlet, D., and Taylor, R. 1990. Partition testing does not inspire confidence. IEEE Transactions on Software Engineering 16(12):1402-1411.
J Clarke, S.B. 1991. Grounding in communication. In Perspectives on Socially Shared Cognition. American Psychological Association.
Lowrey, M.; Boyd, M.; and Kulkarni, D. 1998. Towards a theory for integration of mathematical verification and empirical testing. In Proceedings, ASE'98: Automated Software Engineering, 322-331.
McCarthy, J. C., and Monk, A. F. 1994. Channels, conversation, co-operation and relevance: all you wanted to know about communication but were afraid to ask. Collaborative Computing 1:35-60.
Menzies, T., and Compton, P. 1997. Applications of abduction: Hypothesis testing of neuroendocrinological qualitative compartmental models. Artificial Intelligence in Medicine 10:145-175. Available from http://www.cse.unsw.edu.au/~timm/pub/docs/96aim.ps.gz.
Menzies, T., and Cukic, B. 1999. Intelligent testing can be very lazy. NASA/WVU IV&V tech report. Submitted to ISE'99.
Menzies, T. 1999. Simpler, faster abductive validation. AAAI-99 (submitted).
Smythe, G. 1989. Brain-hypothalmus, Pituitary and the Endocrine Pancreas. The Endocrine Pancreas,