TDDD10 AI Programming
Cooperation and Coordination 1
Cyrille Berger

Lectures
  1 AI Programming: Introduction
  2 Introduction to RoboRescue
  3 Agents and Agents Architecture
  4 Multi-Agent and Communication
  5 Multi-Agent Decision Making
  6 Cooperation and Coordination 1
  7 Cooperation and Coordination 2
  8 Machine Learning
  9 Automated Planning
  10 Putting It All Together

Lecture goals
  Acquire knowledge on:
  - how agents can fuse information
  - how to make agents work together

Lecture content
  - Cooperative Sensing & Exploration
    - Cooperative State Estimation
    - Extracting Predicates
    - Robot Exploration
  - Coalitions & Roles
    - Dynamic Role Assignment
    - Coalition Formation
  - Case study: ResQ Freiburg Task Allocation

Cooperative Sensing & Exploration

Cooperative State Estimation

Why State Estimation?
  - Robots need to be aware of their current state in order to perform meaningful actions!
  - Where am I located in the world?
  - Where are the victims?

Modeling Sensor Noise (1/2)
  - Data from sensors is noisy
    - Inaccuracy of analog/digital conversion
    - Low signal
  - Sensors are represented by a probabilistic sensor model p(z|x)
    - Answers the question: what is the probability of measuring z, given that I am located in state x?
  - Example: a laser scanner located 1 m from the wall returns 1.20 m on average every 10th time
    - p(1.2|1) = 0.1
    - p(1|1) = 0.9
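The laser scanner example can be written down as a small lookup table. The following is a minimal sketch, assuming the sensor only ever reports the two readings named on the slide; the names `SENSOR_MODEL`, `likelihood`, and `simulate_readings` are hypothetical and only serve the illustration.

```python
import random

# Discrete sensor model p(z | x) for the laser scanner example:
# at a true distance of 1.0 m the scanner reports 1.0 m with
# probability 0.9 and 1.2 m with probability 0.1.
# Assumption: no other readings occur.
SENSOR_MODEL = {
    1.0: {1.0: 0.9, 1.2: 0.1},
}


def likelihood(z, x):
    """Return p(z | x); readings not listed in the table get probability 0."""
    return SENSOR_MODEL.get(x, {}).get(z, 0.0)


def simulate_readings(x, n, seed=0):
    """Draw n noisy readings from the sensor model for the true state x."""
    rng = random.Random(seed)
    readings = list(SENSOR_MODEL[x].keys())
    weights = list(SENSOR_MODEL[x].values())
    return rng.choices(readings, weights=weights, k=n)


samples = simulate_readings(1.0, 10_000)
# Fraction of 1.2 m readings; should be close to p(1.2 | 1) = 0.1.
print(samples.count(1.2) / len(samples))
```

The table form only works for a handful of discrete readings; a continuous sensor needs a density instead of a table.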
Modeling Sensor Noise (2/2)
  - Sensor noise is continuous
    - the distance measurement of a laser at one meter can be 1 m ± 1 cm
  - Sensor noise is typically modeled by a normal distribution (Gaussian)
  - Fully described by mean μ and standard deviation σ (or variance σ²)

Normal Distribution (1/2)
  - Univariate: p(x) = 1 / (σ √(2π)) · exp(−(x − μ)² / (2σ²))
  - Multivariate: p(x) = (2π)^(−d/2) |Σ|^(−1/2) · exp(−½ (x − μ)ᵀ Σ⁻¹ (x − μ))

Normal Distribution (2/2)
  [Figure: normal distribution in 1D and 2D]

State Estimation
  - Continuous integration of sensor data according to probability distributions
  - Sensor observations are taken in different coordinate frames, e.g., camera, laser
    - Transformation of measurements
  - State estimation is the process of integrating multiple observations to estimate a state, e.g. the robot location or the locations of victims

Example of Bayesian State Estimation
  - Suppose a robot obtains measurement z
  - What is P(open|z)?

Causal vs Diagnostic Reasoning
  - P(open|z) is diagnostic
  - P(z|open) is causal, i.e., the sensor model
  - Often causal knowledge is easier to obtain
  - Bayes rule allows us to use causal knowledge:
      P(open|z) = P(z|open) P(open) / P(z)

Example
  - Initial state: P(open) = P(¬open) = 0.5
  - Sensor model: P(z|open) = 0.6, P(z|¬open) = 0.3
  - Law of total probability: compute the marginal probability P(z)
      P(z) = P(z|open) P(open) + P(z|¬open) P(¬open) = 0.6 · 0.5 + 0.3 · 0.5 = 0.45
      P(open|z) = 0.6 · 0.5 / 0.45 = 2/3 ≈ 0.67
  - z raises the probability of the belief that the door is open

General Framework: Recursive Bayesian Filtering
  - zₜ: sensor observation at time t
  - xₜ: state at time t
  - In its standard form:
      Bel(xₜ) = η · p(zₜ|xₜ) · ∫ p(xₜ|xₜ₋₁) · Bel(xₜ₋₁) dxₜ₋₁
    - p(zₜ|xₜ): likelihood (sensor model)
    - p(xₜ|xₜ₋₁): transition (or motion) model
    - Bel(xₜ₋₁): prior

Algorithms for Bayesian Filtering
  - Kalman Filter: optimal for linear systems and normal distributions, very efficient, uni-modal, very good for high-dimensional problems
  - Monte Carlo Localization (Particle Filter): good for any distribution, can be computationally expensive, multi-modal, limited to low-dimensional problems

Kalman Filter vs Simple Averaging
  - Kalman filtering compared to simple averaging: highly confident estimates are more strongly weighted
  [Figure: triangulation; panels: Kalman filtering, simple averaging]

Importance of State Estimation

Monte Carlo Localization as Observation Filter
  - The Kalman filter can only handle a single hypothesis
    - However, color thresholding on a soccer field might confuse, for example, "red t-shirts" with the ball
    - Consequently, Kalman filtering yields poor results
  - Monte Carlo Localization: simultaneous tracking of multiple hypotheses
    - Can be used to filter out hypotheses weakly supported by observations over time

Monte Carlo Localization

Monte Carlo Localization
  - Goal: an approach for dealing with arbitrary distributions

Key Idea: Samples
  - Use multiple samples to represent arbitrary distributions

Particle Set
  - Set of weighted samples: X = { ⟨x[i], w[i]⟩ }
    - x[i]: state hypothesis
    - w[i]: importance weight
  - The samples represent the posterior: p(x) ≈ Σ_i w[i] · δ(x − x[i])

Particles for Approximation
  - Particles for function approximation
  - The more particles fall into an interval, the higher its probability density
  - How to obtain such samples?

Particle Filter
  - Recursive Bayes filter
  - Non-parametric approach
  - Models the distribution by samples
  - Prediction: draw from the proposal
  - Correction: weighting by the ratio of target and proposal
  - The more samples we use, the better the estimate!

Particle Filter Algorithm
  1 Sample the particles using the proposal distribution
  2 Compute the importance weights
  3 Resampling: "Replace unlikely samples by more likely ones"

Monte Carlo Localization
  - Each particle is a pose hypothesis
  - Proposal is the motion model
  - Correction via the observation model

Particle Filter for Localization
  [Figure]

Resampling
  - Needed as we have a limited number of samples
  - Survival of the fittest: "Replace unlikely samples by more likely ones"
  - "Trick" to avoid that many samples cover unlikely states

Phantom Balls: Development of Probability Distribution
  [Figures: first, second, third, fourth, fifth, and sixth observation]

Extracting Predicates

Symbolic reasoning
  - Predicates are needed for symbolic reasoning

      while true do
          get next percept p;
          B := brf(B, p);
          I := deliberate(B);
          P := plan(B, I);
          execute(P);
      end while

  - Predicates are the basis for action selection and strategic decision making
  - Can be considered as world model abstractions

Case Study: Extracting predicates for playing soccer
  - Extended predicates:
    - Computed by normalized grids (f_i: ℜ × ℜ → [0..1])
    - Discretized into cells, e.g., 10 × 10 cm in size
  - Simple predicates of objects (can be directly computed from positions):
    - InOpponentsGoal(object): object in the opponent's goal?
    - InOwnGoal(object): object in own goal?
    - CloseToBorder(object): is the distance to any border below a threshold?
    - FrontClear(): neither another object nor the border is in front?
    - InDefense(object): object in the last third of the soccer field?
  - Examples of grid functions:
    - f_free: indicates positions under the influence of the opponent
    - f_covered: indicates positions covered by teammates
    - f_desired: indicates tactically good positions

Robot Exploration

Robot Exploration
  - A team of robots has to explore an initially unknown environment by sensor coverage
  - Find an assignment of target locations to robots that minimizes the overall exploration time
  - Variants:
    - Centralized coordination via world model data exchange
    - Centralized coordination with assignment optimization
    - Decentralized coordination by peer-to-peer communication

Frontier Exploration
  - Robots fuse and share their local maps
  - The frontiers between free space and unknown areas are potential target locations
    - Frontier Exploration (Yamauchi et al., 98)
  - Find a good assignment of frontier locations to robots to minimize overall exploration time

Level of coordination
  - No exchange of information
  - Implicit coordination: sharing a joint map
    - Communication and fusion of local maps
    - Central mapping system
  - Explicit coordination: determine better target locations to distribute the robots
    - Combinatorial problem: "planner" for robot target assignment

Example: Need for Explicit Coordination
  [Figure]

Explicit Coordination
  - Choose target locations at the frontier to the unexplored area by trading off the expected information gain and travel costs
  - Reduce the utility of target locations whenever they are expected to be covered by the sensors of another robot
  - Use cooperative sensing, aka distributed state estimation, to compute the joint map

The Coordination Algorithm
  1 Determine the set of frontier cells
  2 Compute for each robot i the cost Vⁱ(x,y) for reaching each frontier cell <x,y>
  3 Set the utility U(x,y) of all frontier cells to 1
  4 While there is a robot left without a target:
    - Determine a robot i and a frontier cell <x,y> which satisfy
        (i, <x,y>) = argmax_{(i', <x',y'>)} ( U(x',y') − Vⁱ'(x',y') )
    - Reduce the utility of each target point <x',y'> in the visibility area of the selected <x,y> according to:
        U(x',y') ← U(x',y') × (1 − P(<x,y>, <x',y'>))

Example Revised
  [Figure]

Typical Trajectories
  [Figure. Left: implicit coordination. Right: explicit coordination]

Exploration Time
  [Figure]

Drawbacks
  - The assignment considered so far is a greedy assignment
  - Better approaches:
    - Hungarian Method
      - Computes the optimal assignment of jobs to machines given a fixed cost matrix
    - Market economy-based approaches (auctions)
      - Robots trade with targets
      - Computational load is shared between the robots

Coalitions & Roles

Dynamic Role Assignment

Dynamic Role Assignment
  - A mechanism to efficiently coordinate agents
    - Predefined roles (e.g. Attacker, Defender, ...)
    - Role-specific behavior selection
  - Assignment: mapping between N roles and M agents
    - Can be according to the context (e.g. team formation)
    - Suited for dynamic domains (e.g. robot soccer)

Example Robot Soccer
  - Avoid swarm behavior and interference (e.g. neither attack your own teammates nor get into the way of an attacking or defending robot)
  - Task decomposition and task (re-)allocation (e.g. the player closest to the ball should go to the ball)
  - Dynamic role changes (e.g. if a player is blocked, another should take …)
  - Coordinating joint execution (e.g. passing the ball)

Case Study: CS-Freiburg Soccer

General Algorithm
  - Assumptions:
    - Fixed ordering of roles {1, 2, …, N}, e.g. role 1 must be assigned first, followed by role 2, etc.
    - Each agent can be assigned to only one role
    - The utility u_ij reflects how appropriate agent i is for role j given the context

      for all agents in parallel
          I := ∅;                          // Committed assignments with ordering
          for each role j = 1, …, N
              compute utility u_i,j;       // Own preference of agent i
              broadcast u_i,j;             // To all other agents
          end;
          wait until all u_i,j are received    // From all the other agents
          for each role j = 1, …, N
              assign role j to agent i* = argmax_{i ∉ I} {u_i,j};
              I := I ∪ {i*};               // Add assignment
          end;
      end.
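The role assignment pseudocode above can be sketched in Python. This is a minimal, single-process sketch under stated assumptions: the broadcast step is replaced by handing every agent the same complete utility matrix, and the function name `assign_roles` as well as the example utilities are hypothetical.

```python
def assign_roles(utilities):
    """Assign roles in their fixed order.

    utilities[i][j] is the utility u_ij of agent i for role j.
    Returns a dict mapping role j to the chosen agent i*.
    """
    num_agents = len(utilities)
    num_roles = len(utilities[0]) if utilities else 0
    committed = set()   # the set I of agents that already hold a role
    assignment = {}
    for j in range(num_roles):
        candidates = [i for i in range(num_agents) if i not in committed]
        if not candidates:
            break       # more roles than agents: remaining roles stay empty
        # i* = argmax over uncommitted agents of u_ij (ties: lowest index)
        best = max(candidates, key=lambda i: utilities[i][j])
        assignment[j] = best
        committed.add(best)
    return assignment


# Hypothetical utilities for three agents and three roles.
example = [
    [0.90, 0.95, 0.10],   # agent 0
    [0.80, 0.60, 0.30],   # agent 1
    [0.20, 0.40, 0.50],   # agent 2
]
print(assign_roles(example))   # {0: 0, 1: 1, 2: 2}
```

In the example, agent 0 would be the best choice for role 1, but it is already committed to role 0, which comes first in the fixed ordering. Because every agent evaluates the same matrix with the same deterministic rule, all agents arrive at the same assignment without further negotiation, provided they actually received identical utilities.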
Case Study: CS-Freiburg Soccer
  - Each player can have one of four roles:
    - goalie (fixed)
      - special hardware setup, thus unable to change this role
    - active player (in charge of dealing with the ball)
      - can approach the ball or bring the ball forward towards the opponent goal
    - strategic player (defender)
      - maintains a position back in its own half
    - supporter (supports either active or strategic)
      - in defensive play it complements the team's defensive formation
      - in offensive play it presents itself to receive a pass close to the opponent's goal
  [Figure labels: active role, strategic role, support role]

Role Utilities
  - Placement: each role has a preferred location, which depends on the situation:
    - ball position, position of teammates and opponents
    - defensive situation or attack
    - computed by potential fields
  - Utility u_ij for each role:
    - "Negative utility (costs)" for reaching the preferred location of the role
    - Costs are computed from partial costs for distance (u_d), turn angle (u_t), objects on the path (u_o)
    - Weighted sum to ensure utilities between 0..1: U_ij = w_d * u_d + w_t * u_t + w_o * u_o

Dynamic Role Assignment
  - Each player computes utilities u_ij and broadcasts the results
  - Group utility:
    - Consider all possible assignments and compute the summed utility from each agent's individual utility for its assigned role
    - Take the assignment with the highest utility sum as the solution (under the assumption that every agent does so)
  - Roles are re-assigned only when
    - the role change is significant, i.e. the new utility >> old utility (hysteresis factor to avoid oscillation)
    - two players agree (by communication)
  - Note that agents might be wrong since the "opinion" about the global position can differ (even with a global world model)

Example for Role Switching (1/2)
  [Figure: Attack against Team Osaka (Japan). The attacking robot is blocked by a defender and consequently replaced by an unblocked player]

Example for Role Switching (2/2)
  [Figure: Defense against Artisti Veneti (Italy). The roles active and strategic player are switched a couple of times]

Failed ball-passing
  [Figure: A pass in the semi-final against the Italian ART Italy team (RoboCup 1999). This was based on the standard plan: "if it is not possible to score directly, wait until the supporter arrives, then make the pass"]

Coalition Formation
Coalition Formation
  - Necessary when tasks are more efficiently solved by a specific combination of agent capabilities
    - E.g. a disaster location requires ambulance and fire brigade
  - Assignment of groups to tasks is necessary when tasks cannot be performed by a single agent
    - E.g. a single fire brigade cannot extinguish a large fire
  - A group of agents is called a coalition
  - A coalition structure is a partitioning of the set of agents into disjoint coalitions
    - An agent participates in only one coalition
    - A coalition may consist of only a single agent
  - Generally, coalitions consist of heterogeneous agents

Applications for Coalition Formation
  - In e-commerce, buyers can form coalitions to purchase a product in bulk and take advantage of price discounts (Tsvetovat et al., 2000)
  - In Real Time Strategy (RTS) games, groups of heterogeneous agents can jointly attack bases of the opponent. The mixture of agents has to match the defense strategy of the opponent
  - Distributed vehicle routing among delivery companies with their own delivery tasks and vehicles (Sandholm 1997)
  - Wide-area surveillance by autonomous sensor networks (Dang 2006)
  - In Rescue, team formation to solve particular sub-problems, e.g. larger robots deploy smaller robots into confined spaces

Fire Brigade Example
  [Figures]

Three Activities in Coalition Formation
  - Coalition structure generation:
    - Partitioning of the agents into exhaustive and disjoint coalitions
    - Inside the coalitions, agents will coordinate their activities, but agents will not coordinate between coalitions
  - Solving the optimization problem in each coalition:
    - Pooling the tasks and resources of the agents in the coalition and solving the joint problem
    - The coalition objective could be to maximize the monetary value, or the overall expected utility
  - Dividing the value of the generated solution:
    - In the end, each agent will receive a value (money or utility) as a result of participating in the coalition
    - In some problems, the coalition value the agents have to share is negative, being a shared cost

Coalition structure generation
  - A group of agents S ⊆ A is called a coalition, where A denotes the set of all agents and S ≠ ∅
    - The coalition of all the agents is called the grand coalition
  - A coalition structure (CS) partitions the set of agents into coalitions
  - The value of each coalition S is given by a function v_S
    - Each coalition value is independent of non-members' actions
  - Input: all possible coalitions and their values
    - Example: A = {1, 2, 3, 4}

Problem Formulation
  - The value of a coalition structure is given by:
      V(CS) = ∑_{S ∈ CS} v_S
  - The goal is to maximize the social welfare of a set of agents A by finding a coalition structure that satisfies:
      CS* = argmax_{CS ∈ Partitions(A)} V(CS)
  - CS* is the social welfare maximizing coalition structure

Special Coalition Values
  - Coalition values are super-additive iff for every pair of disjoint coalitions S, T ⊆ A: v_{S∪T} ≥ v_S + v_T
    - If coalition values are super-additive, then the coalition structure containing the grand coalition gives the highest value
    - Agents cannot do worse by teaming up
  - Coalition values are sub-additive iff for every pair of disjoint coalitions S, T ⊆ A: v_{S∪T} < v_S + v_T
    - If coalition values are sub-additive, then the coalition structure {{a} | a ∈ A}, in which no agent cooperates, gives the highest value
  - Is the ambulance rescue task in the RoboCup Rescue domain super-additive, sub-additive, or neither?
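For a handful of agents, the problem formulation above can be checked by brute force. The sketch below is a minimal illustration, assuming a made-up value table for three agents; `partitions`, `structure_value`, `best_structure`, `is_super_additive`, and the numbers in `VALUES` are all hypothetical and not taken from the lecture.

```python
from itertools import combinations


def partitions(agents):
    """Yield every coalition structure (set partition) of the agent list."""
    if not agents:
        yield []
        return
    first, rest = agents[0], agents[1:]
    for smaller in partitions(rest):
        # Put `first` into each existing coalition in turn ...
        for k, coalition in enumerate(smaller):
            yield smaller[:k] + [[first] + coalition] + smaller[k + 1:]
        # ... or let it form a coalition of its own.
        yield [[first]] + smaller


def structure_value(cs, v):
    """V(CS): sum of the coalition values v_S over all S in CS."""
    return sum(v[frozenset(s)] for s in cs)


def best_structure(agents, v):
    """CS*: the coalition structure with the highest value V(CS)."""
    return max(partitions(agents), key=lambda cs: structure_value(cs, v))


def is_super_additive(agents, v):
    """Check v_{S∪T} >= v_S + v_T for every pair of disjoint coalitions."""
    coalitions = [frozenset(c) for r in range(1, len(agents) + 1)
                  for c in combinations(agents, r)]
    return all(v[s | t] >= v[s] + v[t]
               for s in coalitions for t in coalitions if not (s & t))


# Hypothetical coalition values for A = {1, 2, 3}.
AGENTS = [1, 2, 3]
VALUES = {
    frozenset({1}): 3, frozenset({2}): 3, frozenset({3}): 3,
    frozenset({1, 2}): 8, frozenset({1, 3}): 5, frozenset({2, 3}): 5,
    frozenset({1, 2, 3}): 9,
}

best = best_structure(AGENTS, VALUES)
print(best, structure_value(best, VALUES))   # [[1, 2], [3]] 11
print(is_super_additive(AGENTS, VALUES))     # False
```

With these values the best structure is neither the grand coalition nor the all-singletons structure, which is what one expects when the values are neither super-additive nor sub-additive. Exhaustive enumeration is only workable for very small agent sets, since the number of partitions grows very quickly.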
  - Coalition structure generation: for N agents, the number of possible coalitions is 2^N − 1, and the number of possible coalition structures is on the order of N^(N/2)

Coalition Structure Search Time
  [Figure]

Coalition graph
  - Nodes represent coalition structures
  - Arcs represent either merges (downwards) or splits (upwards)
  - Searching the whole coalition graph for the optimal coalition structure is intractable (only feasible if |A| < 15)

Approximate Solution to Structure Search
  - Can we approximate the search by visiting only a subset L of the nodes?
  - Choose a set L (a subset of all coalition structures of A) and pick the best coalition structure seen:
      CS_L* = argmax_{CS ∈ L} V(CS)
  - One requirement is to guarantee that the found coalition structure is within a worst-case bound k from optimal:
      k * V(CS_L*) ≥ V(CS*)

Approximate Solution
  - If the bottom two levels of the graph are considered, then:
    - k = |A|
    - and the number of nodes searched is n = 2^(|A|−1)
    - it can be proven that no other search algorithm can establish a better bound k while searching n = 2^(|A|−1) nodes or fewer

Coalition Structure Search Algorithm
  1 Search the bottom two levels of the coalition structure graph
  2 Continue with breadth-first search from the top of the graph as long as there is time left, or until the entire graph has been searched
  3 Return the coalition structure that has the highest welfare among those seen so far

Case study: ResQ Freiburg Task Allocation

Case study: ResQ Freiburg Task Allocation
  - N ambulance teams have to rescue M civilians after an earthquake
  - Civilians are characterized by:
    - Buriedness: proportional to the required resource
    - Hit-Points: decrease to zero when the civilian dies
    - Damage: how much the hit-points decrease
  - Costs are the time to rescue a civilian, composed of the coalition's joint maximum travel time to reach the victim and the time needed for the rescue
  - The overall utility is the number of rescued civilians (the civilians brought to a refuge)

Problem as Sequence Assignment
  - Assign a sequence R of tasks (here victims) to the grand coalition of agents A (here ambulances)
    - R = <r1, r2, …, rN>, where ri denotes a rescue task and i the position in the sequence
  - U(R) denotes the predicted utility (the number of survivors) when executing sequence R
  - Hence, the problem is to find the optimal sequence from the set of all possible sequences:
      R* = argmax_{R} U(R)
  - Enumerating all possible sequences is intractable (N!)

Greedy solution
  - Greedy solutions:
    - Prefer victims that can be rescued fast (small buriedness)
    - Prefer urgent victims (high damage)

Implementation
  - Non-allocated agents (e.g. police & fire brigades) continuously search unexplored locations and update information (e.g. buriedness, health) about known victims
  - The ambulance station (agent)
    - predicts for each known victim the lifetime and the costs for rescue
    - simulates rescue sequences, selected by a genetic algorithm, over the set of known victims
    - When a better sequence has been found, the rescue sequence of agents in the field is altered
  - Lifetime prediction
    - Learning of a decision tree for the classification of victims into "will die" and "will survive"
    - Adaptive Boosting (AdaBoost) for the regression learning of the lifetime prediction (previously on data sets)
    - Calculation of confidence values with respect to the age of the information (e.g. the older the information, the more unreliable the prediction)

Results RoboCup 2004
  [Figure]

Summary
  - Cooperative sensing with Kalman Filter and Particle Filter
    - Make use of sensor information
  - Coordination technique for exploration of the environment
  - Dynamic role assignment is an efficient method for team coordination
  - Action selection and coordination are essential when acting in groups
  - Coalition formation is the process of finding the "social welfare" maximizing coalition structure among a set of agents
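The sequence assignment problem from the ResQ Freiburg case study can be illustrated on a toy instance. The sketch below is not the team's implementation: it assumes a strongly simplified model in which each victim has a fixed rescue time (travel plus digging out) and a predicted lifetime, the ambulances work on one victim at a time, and a victim survives if its rescue ends no later than its lifetime. The victim data and all function names are hypothetical. It compares the two greedy orderings with an exhaustive search over all N! sequences, which is only possible here because N = 4.

```python
from itertools import permutations

# Hypothetical victims: name -> (rescue_time, lifetime), same time unit.
VICTIMS = {
    "a": (4, 5),
    "b": (2, 11),
    "c": (3, 7),
    "d": (5, 9),
}


def utility(sequence, victims):
    """U(R): predicted number of survivors when executing sequence R."""
    t = 0
    survivors = 0
    for name in sequence:
        rescue_time, lifetime = victims[name]
        t += rescue_time          # the team always spends the rescue time
        if t <= lifetime:         # victim is still alive when rescue ends
            survivors += 1
    return survivors


def fastest_first(victims):
    """Greedy: prefer victims that can be rescued fast."""
    return sorted(victims, key=lambda n: victims[n][0])


def most_urgent_first(victims):
    """Greedy: prefer urgent victims (assumed here: shortest lifetime)."""
    return sorted(victims, key=lambda n: victims[n][1])


def best_sequence(victims):
    """R*: exhaustive argmax over all permutations (N! sequences)."""
    return max(permutations(sorted(victims)),
               key=lambda seq: utility(seq, victims))


print(utility(fastest_first(VICTIMS), VICTIMS))      # 2
print(utility(most_urgent_first(VICTIMS), VICTIMS))  # 2
print(utility(best_sequence(VICTIMS), VICTIMS))      # 3
```

On this instance both greedy orderings save two victims while the best sequence saves three, which is the motivation for searching over sequences. For realistic numbers of victims the exhaustive search has to be replaced by a heuristic search, such as the genetic algorithm mentioned on the implementation slide.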