Lectures TDDD10 AI Programming

TDDD10 AI Programming
Cooperation And Coordination 1
Cyrille Berger

1 AI Programming: Introduction
2 Introduction to RoboRescue
3 Agents and Agents Architecture
4 Multi-Agent and Communication
5 Multi-Agent Decision Making
6 Cooperation And Coordination 1
7 Cooperation And Coordination 2
8 Machine Learning
9 Automated Planning
10 Putting It All Together

Lecture goals
Acquire knowledge on how:
agents can fuse information
to make agents work together

Lecture content
Cooperative Sensing & Exploration
Cooperative State Estimation
Extracting Predicates
Robot Exploration
Coalitions & Roles
Dynamic Role Assignment
Coalition Formation
Case study: ResQ Freiburg Task Allocation

Cooperative Sensing & Exploration

Cooperative State Estimation

Why State Estimation?
Robots need to be aware of their current state in order to perform meaningful actions!
Where am I located in the world?
Where are the victims?

Modeling Sensor noise (1/2)
Data from sensors is noisy
Inaccuracy of analog/digital conversion
Low signal
Sensors are represented by a probabilistic sensor model p(z|x)
Answers the question: What is the probability of measuring z, given that I am located in state x?
Example: a laser scanner located 1 m from the wall returns 1.20 m, on average, every 10th time
p(1.2|1) = 0.1
p(1|1) = 0.9

Modeling Sensor noise (2/2)
Sensor noise is continuous
the distance measurement of a laser at one meter can be 1 m ± 1 cm
Sensor noise is typically modeled by a normal distribution (Gaussian)
Fully described by mean μ and standard deviation σ (or variance σ²)

Normal Distribution (1/2)
Univariate: p(x) = 1/√(2πσ²) · exp(−(x − μ)² / (2σ²))
Multivariate: p(x) = (2π)^(−d/2) · |Σ|^(−1/2) · exp(−½ (x − μ)ᵀ Σ⁻¹ (x − μ))

Normal Distribution (2/2)
[Figure: normal distribution in 1D and 2D]

State Estimation
State Estimation is the process of integrating multiple observations to estimate a state
e.g. robot location, locations of victims
Sensor observations are taken in different coordinate frames, e.g., camera, laser
Transformation of measurements
Continuous integration of sensor data according to probability distributions
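
The Gaussian sensor model described above can be sketched in a few lines. This is only an illustration: the function names and the 1 cm standard deviation (taken from the "1 m ± 1 cm" laser example) are assumptions, not part of the lecture material.

```python
import math


def gaussian_pdf(z: float, mean: float, sigma: float) -> float:
    """Density of the univariate normal distribution N(mean, sigma^2) at z."""
    return math.exp(-0.5 * ((z - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def sensor_model(z: float, x: float, sigma: float = 0.01) -> float:
    """p(z | x): density of reading z when the true distance is x.

    sigma = 0.01 m is an assumed value for the laser noise.
    """
    return gaussian_pdf(z, mean=x, sigma=sigma)


# A reading that matches the true distance is far more likely than one 2 cm off.
print(sensor_model(1.00, 1.0) > sensor_model(1.02, 1.0))
```

Note that for a continuous model p(z|x) is a density, so its value can exceed 1; only integrals over intervals are probabilities.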

Example of Bayesian State Estimation
Suppose a robot obtains measurement z
What is P(open|z)?

Causal vs Diagnostic Reasoning
P(open|z) is diagnostic
P(z|open) is causal, i.e., the sensor model
Often causal knowledge is easier to obtain
Bayes rule allows us to use causal knowledge:
P(open|z) = P(z|open) · P(open) / P(z)

Example
Initial state
P(open) = P(¬open) = 0.5
Sensor model
P(z|open) = 0.6, P(z|¬open) = 0.3
Law of total probability: compute marginal probability p(z)
P(z) = P(z|open) · P(open) + P(z|¬open) · P(¬open)
z raises the probability of the belief that the door is open

General Framework: Recursive Bayesian Filtering
zₜ: Sensor observation at time t
xₜ: State at time t
Bel(xₜ) = η · p(zₜ|xₜ) · ∫ p(xₜ|xₜ₋₁) · Bel(xₜ₋₁) dxₜ₋₁
Likelihood (sensor model): p(zₜ|xₜ)
Transition (or motion) model: p(xₜ|xₜ₋₁)
Prior: Bel(xₜ₋₁)
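
The door example can be checked numerically. The sketch below uses only the numbers given on the slide; the function name `bayes_update` is made up for illustration, and the second update assumes a further, conditionally independent measurement with the same sensor model.

```python
def bayes_update(prior_open: float, p_z_given_open: float, p_z_given_closed: float) -> float:
    """Return P(open | z) from the prior P(open) and the sensor model."""
    # Law of total probability: marginal probability of the measurement z.
    evidence = p_z_given_open * prior_open + p_z_given_closed * (1.0 - prior_open)
    # Bayes rule: diagnostic probability from causal knowledge.
    return p_z_given_open * prior_open / evidence


posterior = bayes_update(0.5, 0.6, 0.3)
print(round(posterior, 3))  # 0.667

# Recursive use: the posterior becomes the prior for the next measurement.
posterior_2 = bayes_update(posterior, 0.6, 0.3)
print(round(posterior_2, 3))  # 0.8
```

The belief rises from 0.5 to about 0.67 after one measurement, which is the sense in which z raises the probability that the door is open.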

Algorithms for Bayesian Filtering
Kalman Filter: optimal for linear systems and normal distributions, very efficient, uni-modal, very good for high-dimensional problems
Monte Carlo Localization (Particle Filter): good for any distribution, can be computationally expensive, multi-modal, limited to low-dimensional problems

Kalman filter vs Simple Averaging
Triangulation
Kalman filtering compared to simple averaging: highly confident estimates are more strongly weighted
[Figure: Kalman filtering vs. simple averaging]
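
The difference between the two fusion schemes can be illustrated with a static one-dimensional case. This is a minimal sketch assuming each observation comes as a (mean, variance) pair of the same quantity; it shows only the measurement update of a Kalman filter, without a motion model, and the numbers are invented.

```python
def simple_average(estimates):
    """Unweighted mean of the observed values; the variances are ignored."""
    return sum(mean for mean, _ in estimates) / len(estimates)


def kalman_fuse(estimates):
    """Fuse (mean, variance) pairs one at a time with the 1D Kalman update."""
    mean, var = estimates[0]
    for z, r in estimates[1:]:
        gain = var / (var + r)           # Kalman gain: trust in the new observation
        mean = mean + gain * (z - mean)  # move towards the new observation
        var = (1.0 - gain) * var         # fused estimate is more certain
    return mean, var


# One confident observation (variance 0.01) and one vague one (variance 1.0).
observations = [(2.0, 0.01), (3.0, 1.0)]
print(simple_average(observations))  # 2.5
print(kalman_fuse(observations))
```

Simple averaging lands halfway between the two observations, while the Kalman update stays close to the confident one.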

Monte Carlo Localization as Observation Filter
Importance of State estimation
The Kalman Filter can only handle a single hypothesis
However, color thresholding on a soccer field might confuse, for example, “red t-shirts” with the ball
Consequently, Kalman filtering yields poor results
Monte Carlo Localization: Simultaneous tracking of multiple hypotheses
Can be used to filter out hypotheses weakly supported by observations over time

Monte Carlo Localization
Goal: approach for dealing with arbitrary distributions

Key Idea: Samples
Use multiple samples to represent arbitrary distributions

Particle Set
Set of weighted samples
x[i]: state hypothesis
w[i]: importance weight
The samples represent the posterior

Particles for Approximation
Particles for function approximation
The more particles fall into an interval, the higher its probability density. How to obtain such samples?

Particle Filter
Recursive Bayes filter:
Non-parametric approach
Models the distribution by samples
Prediction: draw from the proposal
Correction: weighting by the ratio of target and proposal
The more samples we use, the better the estimate!

Particle Filter Algorithm

Monte Carlo Localization
1 Sample the particles using the proposal distribution
  Each particle is a pose hypothesis
  Proposal is the motion model
2 Compute the importance weights
  Correction via the observation model
3 Resampling: “Replace unlikely samples by more likely ones”

Particle Filter for Localization

Resampling
Needed as we have a limited number of samples
Survival of the fittest: “Replace unlikely samples by more likely ones”
“Trick” to avoid that many samples cover unlikely states
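
The three steps (sample from the motion model, weight by the observation model, resample) can be sketched for a robot moving along a line. Everything concrete here is an assumption made for illustration: the 1D world, the Gaussian motion and sensor noise, the direct position measurement, and all parameter values.

```python
import math
import random


def gaussian(z, mean, sigma):
    """Density of N(mean, sigma^2) at z, used as observation model p(z | x)."""
    return math.exp(-0.5 * ((z - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def particle_filter_step(particles, control, measurement, rng,
                         motion_sigma=0.1, sensor_sigma=0.2):
    # 1. Prediction: draw each particle from the proposal (the motion model).
    predicted = [x + control + rng.gauss(0.0, motion_sigma) for x in particles]
    # 2. Correction: importance weights from the observation model.
    weights = [gaussian(measurement, x, sensor_sigma) for x in predicted]
    if sum(weights) == 0.0:
        weights = [1.0] * len(predicted)  # degenerate case: fall back to uniform
    # 3. Resampling: replace unlikely samples by more likely ones.
    return rng.choices(predicted, weights=weights, k=len(predicted))


rng = random.Random(0)
particles = [rng.uniform(0.0, 10.0) for _ in range(500)]  # unknown start position
true_x = 2.0
for _ in range(10):
    true_x += 0.5                          # the robot moves 0.5 m per step
    z = true_x + rng.gauss(0.0, 0.2)       # noisy position measurement
    particles = particle_filter_step(particles, 0.5, z, rng)

estimate = sum(particles) / len(particles)
print(f"true position {true_x:.2f}, estimate {estimate:.2f}")
```

The particle set starts spread over the whole corridor and contracts around the true position as measurements arrive.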

Phantom Balls: Development of Probability Distribution
[Figures: probability distribution after the first, second, third, fourth, fifth and sixth observation]

Symbolic reasoning
Predicates are needed for symbolic reasoning
while true do
  get next percept p;
  B := brf(B, p);
  I := deliberate(B);
  P := plan(B, I);
  execute(P);
end while

Extracting Predicates
Predicates are the basis for action selection and strategic decision making
Can be considered as world model abstractions

Case-Study: Extracting predicates for playing soccer
Simple predicates of objects (can be directly computed from positions):
InOpponentsGoal(object): Object in opponent goal?
InOwnGoal(object): Object in own goal?
CloseToBorder(object): The distance to any border is below a threshold?
FrontClear(): Neither another object nor the border is in front?
InDefense(object): Object in the last third of the soccer field?

Case-Study: Extracting predicates for playing soccer
Extended predicates:
Computed by normalized grids (fᵢ: ℜ × ℜ ⇒ [0..1])
Discretized into cells, e.g., 10 × 10 cm size
Examples:
f_free: indicates positions under the influence of the opponent
f_covered: indicates positions covered by teammates
f_desired: indicates tactically good positions
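
Two of the simple predicates can be sketched directly from positions. The field geometry below is hypothetical (a 10 m × 5 m field with the own goal line at x = 0), as are the threshold value and the reading of "last third" as the third nearest the own goal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Assumed rectangular field; own goal line at x = 0, opponent's at x = length."""
    length: float = 10.0
    width: float = 5.0


def close_to_border(pos, field, threshold=0.3):
    """CloseToBorder(object): distance to the nearest border is below the threshold."""
    x, y = pos
    return min(x, field.length - x, y, field.width - y) < threshold


def in_defense(pos, field):
    """InDefense(object): object lies in the third of the field nearest the own goal."""
    return pos[0] < field.length / 3.0


field = Field()
ball = (0.1, 2.5)
print(close_to_border(ball, field), in_defense(ball, field))
```

Predicates like these turn continuous positions from the world model into boolean facts that a symbolic planner can test.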

Robot Exploration
A team of robots has to explore an initially unknown environment by sensor coverage
Find an assignment of target locations to robots that minimizes the overall exploration time
Variants
Centralized coordination via world model data exchange
Centralized coordination with assignment optimization
Decentralized coordination by peer-to-peer communication

Frontier Exploration
Robots fuse and share their local maps
The frontiers between free space and unknown areas are potential target locations
Find a good assignment of frontier locations to robots to minimize overall exploration time

Level of coordination
No exchange of information
Implicit coordination: Sharing a joint map
Communication and fusion of local maps
Central mapping system
Frontier Exploration (Yamauchi et al., 98)
Explicit coordination: Determine better target locations to distribute the robots
Combinatorial Problem: “planner” for robot target assignment

Example: Need for Explicit Coordination

Explicit Coordination
Choose target locations at the frontier to the unexplored area by trading off the expected information gain and travel costs
Reduce utility of target locations whenever they are expected to be covered by the sensors of another robot
Use cooperative sensing aka distributed state estimation to compute the joint map

The Coordination Algorithm
1 Determine the set of frontier cells
2 Compute for each robot i the cost Vⁱ(x,y) for reaching each frontier cell <x,y>
3 Set the utility of all frontier cells to 1
4 While there is one robot left without a target
  Determine a robot i and a frontier cell <x,y> which satisfy:
  (i,<x,y>) = argmax{i',<x',y'>} (U(x',y') - Vⁱ'(x',y'))
  Reduce the utility of each target point <x',y'> in the visibility area of the selected <x,y> according to:
  U(x',y') ← U(x',y') × (1 - P(<x,y>,<x',y'>))

Example Revised
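
The coordination algorithm can be sketched as follows. Several pieces are assumptions made for illustration: frontier cells are given as a list (step 1 is skipped), the travel cost is scaled Euclidean distance rather than a path cost on the map, and the visibility probability P falls off linearly with distance up to an assumed sensor range.

```python
import math


def visibility(a, b, sensor_range):
    """Assumed P(a, b): chance that cell b is covered by a robot standing at a."""
    return max(0.0, 1.0 - math.dist(a, b) / sensor_range)


def assign_targets(robots, frontiers, sensor_range=2.0, cost_weight=0.1):
    """Greedy robot/target assignment; robots maps name -> (x, y)."""
    # Step 2: cost for each robot to reach each frontier cell.
    cost = {(r, f): cost_weight * math.dist(pos, f)
            for r, pos in robots.items() for f in frontiers}
    # Step 3: every frontier cell starts with utility 1.
    utility = {f: 1.0 for f in frontiers}
    assignment = {}
    unassigned = set(robots)
    # Step 4: pick the best (robot, cell) pair until every robot has a target.
    while unassigned:
        r, f = max(((r, f) for r in sorted(unassigned) for f in frontiers),
                   key=lambda rf: utility[rf[1]] - cost[rf])
        assignment[r] = f
        unassigned.remove(r)
        # Discount cells that the chosen target's sensors will probably cover.
        for g in frontiers:
            utility[g] *= 1.0 - visibility(f, g, sensor_range)
    return assignment


robots = {"r1": (0.0, 0.0), "r2": (0.5, 0.0)}
frontiers = [(1.0, 0.0), (1.5, 0.0), (6.0, 0.0)]
plan = assign_targets(robots, frontiers)
print(plan)
```

In this example both robots are closest to the cell at (1, 0), but once r2 has claimed it the nearby cells lose utility and r1 is sent to the distant frontier instead.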

Typical Trajectories
Left: implicit coordination
Right: explicit coordination

Exploration Time

Drawbacks
The assignment considered so far is a greedy assignment
Approaches closer to optimal:
Hungarian Method
Computes the optimal assignment of jobs to machines given a fixed cost matrix
Market economy-based approaches (Auctions)
Robots trade with targets
Computational load is shared between the robots

Coalitions & Roles
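
A small cost matrix shows why a greedy assignment can be far from optimal. In this sketch the optimum is found by brute force over all permutations, which stands in for the Hungarian Method (the latter reaches the same optimum in polynomial time); the cost values are invented.

```python
from itertools import permutations


def greedy_assignment(cost):
    """Repeatedly commit the cheapest remaining (robot, target) pair."""
    n = len(cost)
    free_robots, free_targets = set(range(n)), set(range(n))
    result = {}
    while free_robots:
        r, t = min(((r, t) for r in free_robots for t in free_targets),
                   key=lambda rt: cost[rt[0]][rt[1]])
        result[r] = t
        free_robots.remove(r)
        free_targets.remove(t)
    return result


def optimal_assignment(cost):
    """Exhaustive search over all one-to-one assignments (tiny inputs only)."""
    n = len(cost)
    best = min(permutations(range(n)),
               key=lambda p: sum(cost[r][p[r]] for r in range(n)))
    return dict(enumerate(best))


def total_cost(cost, assignment):
    return sum(cost[r][t] for r, t in assignment.items())


# cost[r][t]: travel cost of robot r to target t
cost = [[1, 2],
        [2, 10]]
print(total_cost(cost, greedy_assignment(cost)))   # 11
print(total_cost(cost, optimal_assignment(cost)))  # 4
```

Greedy grabs the single cheapest pair first and leaves the second robot with the expensive target, whereas the optimal assignment accepts a slightly worse first pair to avoid it.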

Dynamic Role Assignment
A mechanism to efficiently coordinate agents
Predefined Roles (e.g. Attacker, Defender, ...)
Role-specific behavior selection
Assignment: Mapping between N roles and M agents
Can be according to the context (e.g. team formation)
Suited for dynamic domains (e.g. robot soccer)

Example Robot Soccer
Avoid swarm behavior and interference (e.g. neither attack your own teammates nor get into the way of an attacking or defending robot)
Task decomposition and task (re-)allocation (e.g. the player closest to the ball should go to the ball)
Dynamic role changes (e.g. if a player is blocked, another should take over)
Coordinating joint execution (e.g. passing the ball)

Case Study: CS-Freiburg Soccer

General Algorithm
Assumptions:
Fixed ordering of roles {1, 2, …, N}, e.g. role 1 must be assigned first, followed by role 2, etc.
Each agent can be assigned to only one role
The utility u_ij reflects how appropriate agent i is for role j given the context

for all agents in parallel
  I := ∅;  // Committed assignments with ordering
  for each role j = 1, …, N
    compute utility u_ij;  // Own preference of agent i
    broadcast u_ij;  // To all other agents
  end;
  Wait until all u_ij are received  // From all the other agents
  for each role j = 1, …, N
    assign role j to agent i* = argmax{i∉I} {u_ij};
    I := I ∪ {i*};  // Add assignment
  end;
end.
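
The assignment loop at the end of the algorithm can be sketched as a function that every agent runs locally once all utilities have been received. The broadcast step is not modeled; the utility table, agent names and role names below are invented for illustration.

```python
def assign_roles(utilities, roles):
    """Assign roles in their fixed priority order.

    utilities[agent][role] is the broadcast utility u_ij; roles is the
    ordered list of roles. Returns a dict role -> agent.
    """
    committed = {}   # role -> agent
    taken = set()    # the set I of agents that already hold a role
    for role in roles:
        candidates = [a for a in utilities if a not in taken]
        if not candidates:
            break    # more roles than agents: remaining roles stay empty
        best = max(candidates, key=lambda a: utilities[a][role])
        committed[role] = best
        taken.add(best)
    return committed


utilities = {
    "A": {"active": 0.9, "strategic": 0.8, "support": 0.2},
    "B": {"active": 0.7, "strategic": 0.3, "support": 0.6},
    "C": {"active": 0.4, "strategic": 0.5, "support": 0.5},
}
print(assign_roles(utilities, ["active", "strategic", "support"]))
```

Because every agent evaluates the same table in the same order, all agents arrive at the same assignment without further negotiation, provided they received identical utilities and break ties identically.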

Case Study: CS-Freiburg Soccer
Each player can have one of four roles:
goalie (fixed)
  special hardware setup, thus unable to change this role
active player (in charge of dealing with the ball)
  can approach the ball or bring the ball forward towards the opponent goal
strategic player: (defender)
  maintains a position back in its own half
supporter: (supports either active or strategic)
  in defensive play it complements the team’s defensive formation
  in offensive play it presents itself to receive a pass close to the opponent's goal

Role Utilities
Placement: each role has a preferred location, which depends on the situation:
  ball position, position of teammates and opponents
  defensive situation or attack
  computed by potential fields
Utility u_ij for each role:
  “Negative utility (costs)” for reaching the preferred location of the role
  Costs are computed from partial costs for distance (u_d), turn angle (u_t), objects on the path (u_o)
  Weighted sum to ensure utilities between 0..1: U_ij = w_d * u_d + w_t * u_t + w_o * u_o
[Figure labels: active role, strategic role, support role]
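
The weighted sum can be written out as below. The weights are hypothetical, and the sketch assumes that each partial utility is already normalized to [0, 1] and that the weights sum to 1, which is what keeps the result in [0, 1].

```python
def role_utility(u_d, u_t, u_o, w_d=0.5, w_t=0.3, w_o=0.2):
    """U_ij = w_d*u_d + w_t*u_t + w_o*u_o with assumed example weights."""
    if abs(w_d + w_t + w_o - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1 to keep the utility in [0, 1]")
    return w_d * u_d + w_t * u_t + w_o * u_o


# Close to the preferred location, moderate turn needed, free path.
print(role_utility(u_d=0.8, u_t=0.5, u_o=1.0))
```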

Dynamic Role Assignment
Each player computes utilities u_ij and broadcasts results
Group utility:
  Consider all possible assignments and compute the summed utility from each agent's individual utility for its assigned role
  Take the assignment with the highest utility sum as solution (under the assumption that every agent does so)
Roles are re-assigned only when
  the role change is significant, i.e. the new utility >> old utility (hysteresis factor to avoid oscillation)
  two players agree (by communication)
Note that agents might be wrong since “opinion” about global position can differ (even with a global world model)

Example for Role Switching (1/2)
Attack against Team Osaka (Japan). The attacking robot is blocked by a defender and consequently replaced by an unblocked player

Example for Role Switching (2/2)
Defense against Artisti Veneti (Italy). The roles active and strategic player are switched a couple of times

Failed ball-passing
A pass in the semi-final against the Italian ART Italy team (RoboCup 1999). This was based on a standard plan: “if it is not possible to score directly, wait until supporter arrives, then make the pass”
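
Group utility and the hysteresis rule can be combined in a short sketch. It assumes an additive hysteresis margin (the slides only say the new utility must be much larger than the old one), omits the agreement-by-communication step, and uses invented utilities.

```python
from itertools import permutations


def group_utility(utilities, assignment):
    """Sum of each agent's individual utility for its assigned role."""
    return sum(utilities[agent][role] for agent, role in assignment.items())


def best_assignment(utilities, roles):
    """Try all assignments of distinct roles to agents; keep the best sum."""
    agents = list(utilities)
    return max((dict(zip(agents, p)) for p in permutations(roles, len(agents))),
               key=lambda asg: group_utility(utilities, asg))


def reassign(current, utilities, roles, hysteresis=0.2):
    """Switch roles only if the group utility improves by more than the margin."""
    candidate = best_assignment(utilities, roles)
    gain = group_utility(utilities, candidate) - group_utility(utilities, current)
    return candidate if gain > hysteresis else current


roles = ["active", "support"]
current = {"A": "active", "B": "support"}

# Marginal improvement: roles stay as they are, which avoids oscillation.
slight = {"A": {"active": 0.5, "support": 0.5}, "B": {"active": 0.6, "support": 0.5}}
print(reassign(current, slight, roles))

# Agent A is blocked: its utility for the active role collapses, so roles switch.
blocked = {"A": {"active": 0.1, "support": 0.6}, "B": {"active": 0.8, "support": 0.5}}
print(reassign(current, blocked, roles))
```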

Coalition Formation
Necessary when tasks are more efficiently solved by a specific combination of agent capabilities
E.g. a disaster location requires ambulance and fire brigade
Assignment of groups to tasks is necessary when tasks cannot be performed by a single agent
E.g. a single fire brigade cannot extinguish a large fire
A group of agents is called a coalition
A coalition structure is a partitioning of the set of agents into disjoint coalitions
An agent participates in only one coalition
A coalition may consist of only a single agent
Generally, coalitions consist of heterogeneous agents

Applications for Coalition Formation
In e-commerce, buyers can form coalitions to purchase a product in bulk and take advantage of price discounts (Tsvetovat et al., 2000)
In Real Time Strategy (RTS) games groups of heterogeneous agents can jointly attack bases of the opponent. Mixtures of agents have to be chosen according to the defense strategy of the opponent
Distributed vehicle routing among delivery companies with their own delivery tasks and vehicles (Sandholm 1997)
Wide-area surveillance by autonomous sensor networks (Dang 2006)
In Rescue, team formation to solve particular sub-problems, e.g. larger robots deploy smaller robots into confined spaces

Fire Brigade Example

Three Activities in Coalition Formation
Coalition structure generation:
  Partitioning of the agents into exhaustive and disjoint coalitions
  Inside the coalitions, agents will coordinate their activities, but agents will not coordinate between coalitions
Solving the optimization problem in each coalition:
  Pooling the tasks and resources of the agents in the coalition and solving the joint problem
  The coalition objective could be to maximize the monetary value, or the overall expected utility
Dividing the value of the generated solution:
  In the end, each agent will receive a value (money or utility) as a result of participating in the coalition
  In some problems, the coalition value the agents have to share is negative, being a shared cost

Fire Brigade Example

Coalition structure generation
A group of agents S ⊆ A is called a coalition, where A denotes the set of all agents and S ≠ ∅
The coalition of all the agents is called the grand coalition
A coalition structure (CS) partitions the set of agents into coalitions
The value of each coalition S is given by a function v(S)
Each coalition value is independent of non-members' actions

Problem Formulation
The value of a coalition structure is given by:
V(CS) = ∑{S∊CS} v(S)
The goal is to maximize the social welfare of a set of agents A by finding a coalition structure that satisfies:
CS* = argmax{CS∊Partitions(A)} V(CS)
CS* is the social welfare maximizing coalition structure

Special Coalition Values
Coalition values are super-additive iff for every pair of disjoint coalitions S, T ⊆ A: v(S∪T) ≥ v(S) + v(T)
If coalition values are super-additive, then the coalition structure containing the grand coalition gives the highest value
Agents cannot do worse by teaming up
Coalition values are sub-additive iff for every pair of disjoint coalitions S, T ⊆ A: v(S∪T) < v(S) + v(T)
If coalition values are sub-additive, then the coalition structure {{a} | a ∈ A} in which no agent cooperates gives the highest value
Is the ambulance rescue task in the RoboCup Rescue domain super-additive, sub-additive, or neither?

Coalition structure generation
Input: all possible coalitions and their values
A = {1, 2, 3, 4}
For N agents the number of possible coalitions is 2^N - 1, and the number of possible coalition structures grows faster than N^(N/2) (it is in ω(N^(N/2)) and O(N^N))
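
For the four-agent input above, all coalition structures can still be enumerated exhaustively. The sketch below does that and picks the welfare-maximizing structure; the two value functions are invented examples of a super-additive and a sub-additive case.

```python
import math


def partitions(agents):
    """Yield every partition of the list `agents` into non-empty coalitions."""
    if not agents:
        yield []
        return
    first, rest = agents[0], agents[1:]
    for part in partitions(rest):
        # Put `first` into each existing coalition in turn ...
        for i in range(len(part)):
            yield part[:i] + [part[i] | {first}] + part[i + 1:]
        # ... or let it form a coalition of its own.
        yield part + [frozenset({first})]


def structure_value(cs, v):
    """V(CS): sum of the coalition values v(S) over all coalitions in CS."""
    return sum(v(s) for s in cs)


def best_structure(agents, v):
    """CS*: the coalition structure with the highest value."""
    return max(partitions(agents), key=lambda cs: structure_value(cs, v))


agents = [1, 2, 3, 4]
print(len(list(partitions(agents))))  # 15 coalition structures for 4 agents

# Super-additive example: v(S) = |S|^2, so the grand coalition wins.
print(best_structure(agents, lambda s: len(s) ** 2))

# Sub-additive example: v(S) = sqrt(|S|), so nobody cooperates.
print(best_structure(agents, lambda s: math.sqrt(len(s))))
```

The count of 15 structures for 4 agents is manageable, but it grows so quickly with the number of agents that exhaustive enumeration stops being an option.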

Coalition graph
Nodes represent Coalition Structures
Arcs represent either merges (downwards) or splits (upwards)

Coalition Structure Search Time
To search the whole coalition graph for the optimal coalition structure is intractable (only feasible if |A| < 15)

Approximate Solution
Can we approximate the search by visiting only a subset L of the nodes?
Choose a set L (a subset of all coalition structures of A) and pick the best coalition structure seen:
CSL* = argmax{CS∊L} V(CS)
One requirement is to guarantee that the found coalition structure is within a worst case bound from optimal:
k * V(CSL*) ≥ V(CS*)

Approximate Solution to Structure Search
If the bottom two levels of the graph are considered then:
k = |A|
and the number of nodes searched is n = 2^(|A|-1)
It can be proven that no other search algorithm can establish a bound k while searching n = 2^(|A|-1) or fewer nodes

Coalition Structure Search Algorithm
1 Search the bottom two levels of the coalition structure graph
2 Continue with breadth-first search from the top of the graph as long as there is time left, or until the entire graph has been searched
3 Return the coalition structure that has the highest welfare among those seen so far

Case study: ResQ Freiburg Task Allocation
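
Step 1 of the search can be sketched as follows; the breadth-first continuation of step 2 is left out. The sketch assumes the bottom two levels are the grand coalition plus every split into exactly two coalitions, and uses an invented value function for which the worst-case bound k = |A| is reached exactly.

```python
from itertools import combinations


def bottom_two_levels(agents):
    """Yield the grand coalition and every split into exactly two coalitions."""
    agents = list(agents)
    everyone = frozenset(agents)
    yield [everyone]
    first, rest = agents[0], agents[1:]
    # Keeping `first` in the left coalition avoids generating each split twice.
    for size in range(len(rest)):
        for extra in combinations(rest, size):
            left = frozenset((first,) + extra)
            yield [left, everyone - left]


def search_bottom_two(agents, v):
    """Best coalition structure among the bottom two levels, with its value."""
    best = max(bottom_two_levels(agents), key=lambda cs: sum(v(s) for s in cs))
    return best, sum(v(s) for s in best)


agents = [1, 2, 3, 4]
nodes = list(bottom_two_levels(agents))
print(len(nodes))  # 8 = 2^(4-1) nodes searched


def lonely(s):
    """Invented values: only single-agent coalitions are worth anything."""
    return 1 if len(s) == 1 else 0


best, value = search_bottom_two(agents, lonely)
print(best, value)
```

With these values the true optimum is the structure of four singletons with value 4, while the best structure in the bottom two levels has value 1, so the guarantee k * V(CSL*) ≥ V(CS*) holds with equality for k = 4.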

Case study: ResQ Freiburg Task Allocation
N ambulance teams have to rescue M civilians after an earthquake
Civilians are characterized by:
  Buriedness: proportional to the required resource
  Hit-Points: decrease to zero when the civilian dies
  Damage: how much the hit-points decrease
Costs are the time to rescue a civilian, composed of the coalition’s joint max travel time to reach the victim, and the time needed for the rescue
The overall utility is the number of rescued civilians (the civilians brought to a refuge)

Problem as Sequence Assignment
Assign a sequence R of tasks (here victims) to the grand coalition of agents A (here ambulances)
R = <r1, r2, …, rN> where ri denotes a rescue task and i the position in the sequence
U(R) denotes the predicted utility (the number of survivors) when executing sequence R
Hence, the problem is to find the optimal sequence from the set of all possible sequences: R* = argmax{R} U(R)
Enumerating all possible sequences is intractable (N!)

Greedy solution
Greedy solutions:
  Prefer victims that can be rescued fast (small buriedness)
  Prefer urgent victims (high damage)

Implementation
Non-allocated agents (e.g. police & fire brigades) continuously search unexplored locations and update information (e.g. buriedness, health) about known victims
The ambulance station (agent)
  predicts for each known victim the lifetime and costs for rescue
  simulates rescue sequences, selected by a genetic algorithm, over the set of known victims
  When a better sequence has been found, the rescue sequence of agents in the field is altered
Lifetime prediction
  Learning of a decision tree for the classification of victims into will die and will survive
  Adaptive Boosting (AdaBoost) for the regression learning of the lifetime prediction (trained previously on data sets)
  Calculation of confidence values with respect to the age of information (e.g. the older the information, the more unreliable the prediction)
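
The sequence-assignment view can be sketched with a very small simulation. The model is an assumption made for illustration: each victim has a predicted lifetime and a total rescue time, victims are handled one after another by the whole coalition, and a victim whose rescue cannot finish in time is skipped. Brute-force enumeration replaces the genetic algorithm and only works for a handful of victims; the victim data is invented.

```python
from itertools import permutations
from typing import NamedTuple


class Victim(NamedTuple):
    name: str
    lifetime: float     # predicted time until the hit points reach zero
    rescue_time: float  # travel plus digging time (grows with buriedness)


def predicted_survivors(sequence):
    """U(R): number of victims freed before their predicted lifetime runs out."""
    clock, saved = 0.0, 0
    for victim in sequence:
        if clock + victim.rescue_time <= victim.lifetime:
            clock += victim.rescue_time
            saved += 1
        # Otherwise the victim is skipped: the rescue could not finish in time.
    return saved


def fastest_first(victims):
    """Greedy: prefer victims that can be rescued fast."""
    return sorted(victims, key=lambda v: v.rescue_time)


def most_urgent_first(victims):
    """Greedy: prefer urgent victims (shortest predicted lifetime)."""
    return sorted(victims, key=lambda v: v.lifetime)


def best_sequence(victims):
    """R*: exhaustive search over all N! sequences (tiny inputs only)."""
    return max(permutations(victims), key=predicted_survivors)


victims = [Victim("a", 100.0, 1.0), Victim("b", 2.0, 2.0), Victim("c", 5.0, 3.0)]
print(predicted_survivors(fastest_first(victims)))      # 2
print(predicted_survivors(most_urgent_first(victims)))  # 3
print(predicted_survivors(best_sequence(victims)))      # 3
```

Here the fastest-first heuristic loses victim b, whose lifetime expires while the easy rescue is under way, which is the kind of case that motivates searching over whole sequences.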

Results RoboCup 2004

Summary
Cooperative sensing with Kalman Filter and Particle Filter
Make use of sensor information
Coordination technique for exploration of the environment
Dynamic role assignment is an efficient method for team coordination
Action selection and coordination are essential when acting in groups
Coalition formation is the process of finding the “social welfare” maximizing coalition structure among a set of agents