Lectures TDDD10 AI Programming

Lectures
1 AI Programming: Introduction
2 Introduction to RoboRescue
3 Agents and Agents Architecture
4 Multi-Agent and Communication
5 Multi-Agent Decision Making
6 Cooperation And Coordination 1
7 Cooperation And Coordination 2
8 Machine
9 Knowledge Representation
10 Putting It All

TDDD10 AI Programming
Cooperation And Coordination 1
Cyrille Berger

Lecture goals

Lecture content
- Cooperative Sensing & Exploration
  - Cooperative State Estimation
  - Extracting Predicates
  - Robot Exploration
- Coalitions & Roles
  - Dynamic Role Assignment
  - Coalition Formation

Cooperative Sensing & Exploration

Cooperative State Estimation

Why State Estimation?
- Robots need to be aware of their current state in order to perform meaningful actions!
- Where am I located in the world?
- Where are the victims?

Modeling Sensor Noise (1/2)
- Sensors are represented by a probabilistic sensor model p(z | x)
- It answers the question: what is the probability of measuring z, given that I am located in state x?
- Example: a laser scanner located 1 m from the wall returns, on average, 1.20 m every 10th time
- Typically represented by a Gaussian

Modeling Sensor Noise (2/2)
- Data from sensors is noisy, e.g., the distance measurement of a laser at one meter can be 1 m ± 1 cm
- Sensor noise is typically modeled by a normal distribution
- Fully described by mean μ and variance σ²

Gaussians (1/2)
- Univariate: $p(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$
- Multivariate: $p(\mathbf{x}) = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{T} \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu})\right)$

Gaussians (2/2)
[Figure: panels labelled 1D and 2D]

State Estimation
- State estimation is the process of integrating multiple observations to estimate a state, i.e. robot location, locations of ...
- Continuous integration of sensor data according to probability distributions
- Sensor observations are taken in different coordinate frames, e.g., camera, laser
- Transformation of ...

Example of Bayesian State Estimation
- Suppose a robot obtains measurement z
- What is P(open | z)?
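The door question can be answered numerically with Bayes' rule. The following is a minimal sketch using the values of the lecture's door example (P(z | open) = 0.6, P(z | ¬open) = 0.3, P(open) = 0.5). The function name `bayes_update` and the second, conditionally independent measurement are assumptions made for illustration.

```python
def bayes_update(prior_open, p_z_given_open, p_z_given_not_open):
    """Return P(open | z) from the prior and the causal sensor model."""
    # Marginal probability of the measurement, summed over both door states.
    evidence = (p_z_given_open * prior_open
                + p_z_given_not_open * (1.0 - prior_open))
    return p_z_given_open * prior_open / evidence


# Values from the lecture's door example.
posterior_1 = bayes_update(0.5, 0.6, 0.3)

# Assumption: a second, conditionally independent measurement with the same
# sensor model; the previous posterior becomes the new prior.
posterior_2 = bayes_update(posterior_1, 0.6, 0.3)

print(round(posterior_1, 3))  # 0.667
print(round(posterior_2, 3))  # 0.8
```

Feeding the posterior back in as the next prior is the recursive pattern that Bayesian filtering builds on.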
Causal vs Diagnostic Reasoning
- P(open | z) is diagnostic
- P(z | open) is causal, i.e., the sensor model
- Often causal knowledge is easier to obtain
- Bayes rule allows us to use causal knowledge: $P(open \mid z) = \frac{P(z \mid open)\, P(open)}{P(z)}$

Example
- P(z | open) = 0.6, P(z | ¬open) = 0.3
- P(open) = P(¬open) = 0.5
- $P(open \mid z) = \frac{P(z \mid open)\, P(open)}{P(z \mid open)\, P(open) + P(z \mid \neg open)\, P(\neg open)} = \frac{0.6 \cdot 0.5}{0.6 \cdot 0.5 + 0.3 \cdot 0.5} = \frac{2}{3} \approx 0.67$
- z raises the probability of the belief that the door is open

General Framework: Recursive Bayesian Filtering
- zₜ: sensor observation at time t
- xₜ: state at time t
- $p(x_t \mid z_{1:t}) = \eta\, p(z_t \mid x_t) \int p(x_t \mid x_{t-1})\, p(x_{t-1} \mid z_{1:t-1})\, dx_{t-1}$
  - Likelihood (sensor model): $p(z_t \mid x_t)$
  - Prior: $p(x_{t-1} \mid z_{1:t-1})$
  - Transition (or motion) model: $p(x_t \mid x_{t-1})$
  - Marginalization: compute the marginal probability (the integral over $x_{t-1}$)

Algorithms for Bayesian Filtering
- Kalman Filter: optimal for linear systems and normal distributions, very efficient, uni-modal
- Monte Carlo Localization (Particle Filter): good for any distribution, can be computationally expensive, multi-modal

Importance of State Estimation
- Triangulation

Kalman Filter vs Simple Averaging
- Kalman filtering compared to simple averaging: highly confident estimates are more strongly weighted
[Figure: panels "Kalman filtering" and "Simple averaging"]

Monte Carlo Localization (MCL) as Observation Filter
- The Kalman filter can only handle a single hypothesis
  - However, color thresholding on a soccer field might confuse, for example, "red t-shirts" with the ball
  - Consequently, Kalman filtering yields poor results
- MCL: simultaneous tracking of multiple hypotheses
  - Can be used to filter out hypotheses weakly supported by observations over time

Markov Localization

Monte Carlo Localization (MCL)
- Goal: approach for dealing with arbitrary distributions

Key Idea: Samples
- Use multiple samples to represent arbitrary distributions

Particle Set
- Set of weighted samples: $\mathcal{X} = \{\langle x^{[i]}, w^{[i]} \rangle\}_{i=1,\dots,N}$
  - x[i]: state
  - w[i]: importance weight
- The samples represent the posterior

Particles for Approximation
- Particles for function approximation
- The more particles fall into an interval, the higher its probability density
- How to obtain such samples?

Particle Filter
- Recursive Bayes filter

Particle Filter Algorithm
- Non-parametric approach
  - Models the distribution by samples
- Prediction: draw from the proposal
- Correction: weighting by the ratio of target and proposal
- The more samples we use, the better is the estimate!
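The prediction and correction steps can be sketched for a robot moving along a line. This is a minimal illustration, not the lecture's implementation: the 1D state, the Gaussian motion and sensor noise levels, the particle count, and all function names are assumptions. With the motion model as proposal, the target/proposal ratio reduces to the measurement likelihood.

```python
import math
import random


def gaussian_pdf(x, mean, sigma):
    """Density of a univariate Gaussian, used here as sensor model p(z | x)."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def particle_filter_step(particles, control, measurement, rng,
                         motion_sigma=0.1, sensor_sigma=0.5):
    # Prediction: draw each particle from the proposal (the noisy motion model).
    predicted = [x + control + rng.gauss(0.0, motion_sigma) for x in particles]
    # Correction: weight each particle by the measurement likelihood p(z | x).
    weights = [gaussian_pdf(measurement, x, sensor_sigma) for x in predicted]
    if sum(weights) == 0.0:
        weights = [1.0] * len(predicted)  # degenerate case: keep all particles
    # Resampling: draw with probability proportional to the weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


rng = random.Random(0)
particles = [rng.uniform(0.0, 10.0) for _ in range(500)]  # unknown start position
for _ in range(5):
    particles = particle_filter_step(particles, control=0.0, measurement=5.0, rng=rng)
estimate = sum(particles) / len(particles)
print(round(estimate, 1))
```

Starting from particles spread uniformly over the corridor, repeated measurements of 5.0 pull the particle mean towards that position.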
Particle Filter Algorithm
1. Sample the particles using the proposal distribution
2. Compute the importance weights
3. Resampling: "Replace unlikely samples by more likely ones"

Monte Carlo Localization

Particle Filter for Localization
- Each particle is a pose hypothesis
- Proposal is the motion model
- Correction via the observation model

Particle Filter for Localization

Resampling
- Needed as we have a limited number of samples
- Survival of the fittest: "Replace unlikely samples by more likely ones"
- "Trick" to avoid that many samples cover unlikely states

Resampling
- Roulette wheel
  - Binary search
  - O(n log n)
- Stochastic universal sampling
  - Low variance
  - O(n)

Phantom Balls: Development of Probability Distribution
[Figure: panels "First observation", "Second observation", "Third observation"]

Phantom Balls: Development of Probability Distribution
[Figure: panels "Fourth observation", "Fifth observation", "Sixth observation"]

Extracting Predicates

Case-Study: Extracting predicates for playing soccer
- Predicates are needed for symbolic ...
- Predicates are the basis for action selection and strategic decision making
- Can be considered as world model abstractions
- Two kinds are used: simple predicates and extended predicates

Simple predicates of objects (can be directly computed from positions). Examples:
- InOpponentsGoal(object): is the object in the opponent's goal?
- InOwnGoal(object): is the object in the own goal?
- CloseToBorder(object): is the distance to any border below a threshold?
- FrontClear(): is neither another object nor the border in front?
- InDefense(object): is the object in the last third of the soccer field?
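Simple predicates of this kind reduce to comparisons on object positions. The sketch below is hypothetical throughout: the field dimensions, the coordinate convention (origin at the field centre, own goal on the negative x side), the threshold, and the reading of "last third" as the own third are all assumptions. FrontClear() is omitted because it additionally needs the robot's heading.

```python
from dataclasses import dataclass

# Hypothetical field geometry in metres; origin at the field centre,
# own goal on the negative x side, opponent goal on the positive x side.
FIELD_LENGTH = 9.0
FIELD_WIDTH = 6.0
GOAL_WIDTH = 2.0
BORDER_THRESHOLD = 0.5


@dataclass(frozen=True)
class FieldObject:
    x: float
    y: float


def in_opponents_goal(obj):
    # Behind the opponent's goal line and between the posts.
    return obj.x > FIELD_LENGTH / 2 and abs(obj.y) <= GOAL_WIDTH / 2


def in_own_goal(obj):
    # Behind the own goal line and between the posts.
    return obj.x < -FIELD_LENGTH / 2 and abs(obj.y) <= GOAL_WIDTH / 2


def close_to_border(obj):
    # Distance to the nearest of the four field borders.
    distance = min(FIELD_LENGTH / 2 - abs(obj.x), FIELD_WIDTH / 2 - abs(obj.y))
    return distance < BORDER_THRESHOLD


def in_defense(obj):
    # Assumed to mean the third of the field in front of the own goal.
    return obj.x < -FIELD_LENGTH / 2 + FIELD_LENGTH / 3


ball = FieldObject(x=-2.0, y=2.8)
predicates = {
    "InOpponentsGoal": in_opponents_goal(ball),
    "InOwnGoal": in_own_goal(ball),
    "CloseToBorder": close_to_border(ball),
    "InDefense": in_defense(ball),
}
print(predicates)
```

The resulting dictionary of truth values is the kind of world model abstraction that a symbolic action selection could consume.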
Extended predicates
- Computed by normalized grids ($f_i: \mathbb{R} \times \mathbb{R} \to [0, 1]$)
- Discretized into cells, e.g., 10 x 10 cm in size
- f_free: indicates positions under the influence of the opponent
- f_covered: indicates positions covered by teammates
- f_desired: indicates tactically good positions

Robot Exploration

Robot Exploration
- A team of robots has to explore an initially unknown environment by sensor coverage
- Find an assignment of target locations to robots that minimizes the overall exploration time
- Variants
  - Centralized coordination via world model data exchange
  - Centralized coordination with assignment optimization
  - Decentralized coordination by peer-to-peer communication

Frontier Exploration
- Robots fuse and share their local maps
- The frontiers between free space and unknown areas are potential target locations
- Find a good assignment of frontier locations to robots to minimize overall exploration time

Example: Need for Explicit Coordination

Level of coordination
- No exchange of information
- Implicit coordination: sharing a joint map
  - Communication and fusion of local maps
  - Central mapping system
  - Frontier Exploration (Yamauchi et al., 98)
- Explicit coordination: determine better target locations to distribute the robots
  - Combinatorial problem: "planner" for robot target assignment

Explicit Coordination
- Choose target locations at the frontier to the unexplored area by trading off the expected information gain and travel costs
- Reduce the utility of target locations whenever they are expected to be covered by the sensors of another robot
- Use cooperative sensing, aka distributed state estimation, to compute the joint map

The Coordination Algorithm
1. Determine the set of frontier cells
2. Compute for each robot i the cost Vⁱ(x,y) for reaching each frontier cell <x,y>
3. Set the utility of all frontier cells to 1
4. While there is one robot left without a target point:
   1. Determine a robot i and a frontier cell <x,y> which satisfies: $(i, \langle x, y \rangle) = \operatorname{argmax}_{(i', \langle x', y' \rangle)} \left( U(x', y') - V^{i'}(x', y') \right)$
   2. Reduce the utility of each target point <x',y'> in the visibility area of the selected <x,y> according to: $U(x', y') \leftarrow U(x', y') \cdot \left(1 - P(\langle x, y \rangle, \langle x', y' \rangle)\right)$

Example Revised

Typical Trajectories

Exploration Time
[Figure: left panel implicit coordination, right panel explicit coordination]

Drawbacks
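The greedy character of the coordination algorithm is easiest to see in code: every iteration commits the best remaining (robot, frontier cell) pair and never revisits it. The following is a minimal sketch under stated assumptions. Straight-line distance stands in for the path cost Vⁱ, costs are scaled to [0, 1] so they are comparable with the utilities, and the coverage probability P falls off linearly within a hypothetical sensor range. None of these choices come from the lecture.

```python
import math


def coverage_probability(target, cell, sensor_range):
    """Assumed P(<x,y>, <x',y'>): linear fall-off with distance from the target."""
    return max(0.0, 1.0 - math.dist(target, cell) / sensor_range)


def assign_targets(robots, frontier_cells, sensor_range=3.0):
    # Step 2: cost V^i(x, y), here straight-line distance scaled to [0, 1].
    raw = {(r, f): math.dist(pos, f) for r, pos in robots.items() for f in frontier_cells}
    scale = max(raw.values()) or 1.0
    cost = {key: value / scale for key, value in raw.items()}
    # Step 3: initial utility of every frontier cell.
    utility = {f: 1.0 for f in frontier_cells}
    assignment = {}
    # Step 4: greedily commit one (robot, frontier cell) pair per iteration.
    while len(assignment) < len(robots):
        robot, target = max(
            ((r, f) for r in robots if r not in assignment for f in frontier_cells),
            key=lambda pair: utility[pair[1]] - cost[pair],
        )
        assignment[robot] = target
        # Discount cells that the sensors at the chosen target will probably cover.
        for f in frontier_cells:
            utility[f] *= 1.0 - coverage_probability(target, f, sensor_range)
    return assignment


robots = {"A": (0.0, 0.0), "B": (0.5, 0.0)}
frontier_cells = [(2.0, 0.0), (2.5, 0.0), (0.0, 6.0)]
assignment = assign_targets(robots, frontier_cells)
print(assignment)
```

In this toy scene robot B takes the nearest cell (2.0, 0.0). The neighbouring cell (2.5, 0.0) is then discounted, so robot A is sent to the distant cell (0.0, 6.0) instead of following B.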
- The assignment considered so far is a greedy assignment
- Approaches closer to optimal:
  - Hungarian Method
    - Computes the optimal assignment of jobs to machines given a fixed cost matrix
  - Market economy-based approaches (auctions)
    - Robots trade with targets
    - Computational load is shared between the robots

Coalitions & Roles

Dynamic Role Assignment

Dynamic Role Assignment
- A mechanism to efficiently coordinate agents
  - Predefined roles (e.g. Attacker, Defender, ...)
  - Role-specific behavior selection
- Assignment: mapping between N roles and M agents
  - Can be according to the context (e.g. team formation)
- Suited for dynamic domains (e.g. robot soccer)

Example Robot Soccer
- Avoid swarm behavior and interference (e.g. neither attack your own teammates nor get into the way of an attacking or defending robot)
- Task decomposition and task (re-)allocation (e.g. the player closest to the ball should go to the ball)
- Dynamic role changes (e.g. if a player is blocked, another should take over)
- Coordinating joint execution (e.g. passing the ball)

Case Study: CS-Freiburg Soccer
Each player can have one of four roles:
- goalie (fixed)
  - special hardware setup, thus unable to change this role
- active player (in charge of dealing with the ball)
  - can approach the ball or bring the ball forward towards the opponent goal
- strategic player

General Algorithm
Assumptions:
- Fixed ordering of roles {1, 2, …, N}, e.g. role 1 must be assigned first, followed by role 2, etc.
- Each agent can be assigned to only one role
- The utility u_ij reflects how appropriate agent i is for role j given the context

    for all agents in parallel
        I := ∅;                          // Committed assignments with ordering
        for each role j = 1, …, N
            compute utility u_i,j;       // Own preference of agent i
            broadcast u_i,j;             // To all other agents
        end;
        Wait until all u_i,j are received  // From all the other agents
        for each role j = 1, …, N
            assign role j to agent i* = argmax_{i ∉ I} {u_i,j};
            I := I ∪ {i*};               // Add assignment
        end;
    end.
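The second loop of the general algorithm, which every agent runs locally once all utilities have been received, can be sketched as follows. The broadcast step is replaced by a shared dictionary, and the agent names, role names, utility values, and the sorted tie-breaking order are illustrative assumptions.

```python
def assign_roles(utilities, roles):
    """Assign roles in their fixed order; utilities[agent][role] is u_ij."""
    committed = set()  # the set I of agents that already hold a role
    assignment = {}
    for role in roles:
        # Sorting gives every agent the same tie-breaking order (an assumption),
        # so all agents reach the same result from the same broadcast values.
        candidates = sorted(agent for agent in utilities if agent not in committed)
        if not candidates:
            break  # more roles than agents
        best = max(candidates, key=lambda agent: utilities[agent][role])
        assignment[role] = best
        committed.add(best)
    return assignment


# Hypothetical utilities as they might look after the broadcast phase.
utilities = {
    "r1": {"active": 0.9, "strategic": 0.4, "support": 0.3},
    "r2": {"active": 0.8, "strategic": 0.7, "support": 0.2},
    "r3": {"active": 0.1, "strategic": 0.6, "support": 0.5},
}
roles_in_order = ["active", "strategic", "support"]
assignment = assign_roles(utilities, roles_in_order)
print(assignment)
```

Because roles are filled in a fixed order, the result depends on that order: an agent committed to an early role is no longer available for a later role it might suit even better.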
Case Study: CS-Freiburg Soccer (continued)
- strategic player: maintains a position back in its own ...
- supporter (supports either active or strategic player)
  - in defensive play it complements the team's defensive ...
  - in offensive play it presents itself to receive a pass close to the opponent's goal

Role Utilities
- Placement: each role has a preferred location, which depends on the situation:
  - ball position, position of teammates and opponents
  - defensive situation or attack
  - computed by potential ...
- Utility u_ij for each role:
  - "Negative utility (costs)" for reaching the preferred location of the role
  - Costs are computed from partial costs for distance (u_d), turn angle (u_t), objects on the path (u_o)
  - Weighted sum to ensure utilities between 0 and 1: $U_{ij} = w_d u_d + w_t u_t + w_o u_o$

Dynamic Role Assignment
- Each player computes utilities u_ij and broadcasts the results
- Group utility:
  - Consider all possible assignments and compute the summed utility from each agent's individual utility for its assigned role
  - Take the assignment with the highest utility sum as solution (under the assumption that every agent does so)
- Roles are re-assigned only when
  - the role change is significant, i.e. the new utility >> old utility (hysteresis factor to avoid oscillation)
  - two players agree (by ...
- Note that agents might lie, since their "opinion" about global position can differ (even with a global world model)
[Figure labels: active role, strategic role, support role]

Example for Role Switching (1/2)
- Attack against Team Osaka (Japan). The attacking robot is blocked by a defender and consequently replaced by an unblocked player

Example for Role Switching (2/2)
- Defense against Artisti Veneti (Italy). The roles active and strategic player are switched a couple of times

Failed ball-passing
- A pass in the semi-final against the Italian ART Italy team (RoboCup 1999). This was based on the standard plan: "if it is not possible to score directly, wait until supporter arrives, then make the pass"

Coalition Formation

Coalition Formation
- Necessary when tasks are more efficiently solved by a specific combination of agent capabilities
  - E.g. a disaster location requires ambulance and fire ...
- Assignment of groups to tasks is necessary when tasks cannot be performed by a single agent
  - E.g. a single fire brigade cannot extinguish a large ...
- A group of agents is called a coalition
- A coalition structure is a partitioning of the set of agents into disjoint coalitions
- An agent participates in only one coalition
- A coalition may consist of only a single agent
- Generally, coalitions consist of heterogeneous agents

Applications for Coalition Formation
- In e-commerce, buyers can form coalitions to purchase a product in bulk and take advantage of price discounts (Tsvetovat et al., 2000)
- In Real Time Strategy (RTS) games, groups of heterogeneous agents can jointly attack bases of the opponent. Mixtures of agents have to be chosen according to the defense strategy of the opponent
- Distributed vehicle routing among delivery companies with their own delivery tasks and vehicles (Sandholm 1997)
- Wide-area surveillance by autonomous sensor networks (Dang 2006)
- In Rescue, team formation to solve particular sub-problems, e.g. larger robots deploy smaller robots into confined spaces

Fire Brigade Example (two slides)

Three Activities in Coalition Formation
- Coalition structure generation:
  - Partitioning of the agents into exhaustive and disjoint coalitions
  - Inside the coalitions, agents will coordinate their activities, but agents will not coordinate between coalitions
- Solving the optimization problem in each coalition:
  - Pooling the tasks and resources of the agents in the coalition and solving the joint problem
  - The coalition objective could be to maximize the monetary value, or the overall expected utility
- Dividing the value of the generated solution:
  - In the end, each agent will receive a value (money or utility) as a result of participating in the coalition
  - In some problems, the coalition value the agents have to share is negative, being a shared cost

Problem Formulation
- A group of agents S ⊆ A is called a coalition, where A denotes the set of all agents and S ≠ ∅
- The coalition of all the agents is called the grand coalition
- A coalition structure (CS) partitions the set of agents into coalitions
- The value of each coalition S is given by a function v_S
  - Each coalition value is independent of non-members' actions
- The value of a coalition structure is given by: $V(CS) = \sum_{S \in CS} v_S$

Coalition structure generation
- The goal is to maximize the social welfare of a set of agents A by finding a coalition structure that satisfies: $CS^* = \operatorname{argmax}_{CS \in \mathrm{Partitions}(A)} V(CS)$
- CS* is the social welfare maximizing coalition structure

Special Coalition Values
- Coalition values are super-additive iff for every pair of disjoint coalitions S, T ⊆ A: $v_{S \cup T} \geq v_S + v_T$
  - If coalition values are super-additive, then the coalition structure containing the grand coalition gives the highest value
  - Agents cannot do worse by teaming up
- Coalition values are sub-additive iff for every pair of disjoint coalitions S, T ⊆ A: $v_{S \cup T} < v_S + v_T$
  - If coalition values are sub-additive, then the coalition structure {{a} | a ∈ A}, in which no agent cooperates, gives the highest value
- Is the ambulance rescue task in the RoboCup Rescue domain super-additive, sub-additive, or neither?

Coalition structure generation
- Input: all possible coalitions and their values
- For N agents the number of possible coalitions is 2^N - 1, and the number of possible coalition structures grows faster than N^(N/2) (it is in ω(N^(N/2)) and O(N^N))

Coalition graph
- Example: A = {1, 2, 3, 4}
- Nodes represent coalition structures
- Arcs represent either merges (downwards) or splits (upwards)

Approximate Solution to Structure Search
- Can we approximate the search by visiting only a subset L of the nodes?

Coalition Structure Search Time
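Exhaustive search over all coalition structures can be sketched for small agent sets, which also shows how quickly the number of structures grows. This is an illustration only: the recursive partition generator and the size-based coalition values are assumptions, chosen so that the values are neither super-additive nor sub-additive.

```python
def partitions(agents):
    """Yield every coalition structure (set partition) of the list `agents`."""
    if not agents:
        yield []
        return
    first, rest = agents[0], agents[1:]
    for smaller in partitions(rest):
        # Either the first agent forms a coalition of its own ...
        yield [[first]] + smaller
        # ... or it joins one of the coalitions that already exist.
        for k in range(len(smaller)):
            yield smaller[:k] + [[first] + smaller[k]] + smaller[k + 1:]


# Hypothetical coalition values that depend only on the coalition size.
VALUE_BY_SIZE = {1: 1.0, 2: 3.0, 3: 4.0, 4: 5.0}


def structure_value(structure):
    """V(CS): sum of the coalition values v_S over all coalitions in CS."""
    return sum(VALUE_BY_SIZE[len(coalition)] for coalition in structure)


agents = [1, 2, 3, 4]
all_structures = list(partitions(agents))
best = max(all_structures, key=structure_value)

print(len(all_structures))          # 15 coalition structures for 4 agents
print(best, structure_value(best))
```

With these values the best structure consists of two pairs (value 6.0), which beats both the grand coalition (5.0) and the all-singletons structure (4.0). Five agents already give 52 structures.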
- Searching the whole coalition graph for the optimal coalition structure is intractable (only feasible if |A| < 15)

Approximate Solution to Structure Search (continued)
- Choose a set L (a subset of all coalition structures of A) and pick the best coalition structure seen: $CS_L^* = \operatorname{argmax}_{CS \in L} V(CS)$
- One requirement is to guarantee that the found coalition structure is within a worst-case bound from optimal: $k \cdot V(CS_L^*) \geq V(CS^*)$

Approximate Solution
- If the bottom two levels of the graph are considered, then:
  - k = |A|
  - the number of nodes searched is n = 2^(|A|-1)
  - it can be proven that no other search algorithm can establish a better bound k while searching n = 2^(|A|-1) or fewer nodes

Coalition Structure Search Algorithm
1. Search the bottom two levels of the coalition structure graph
2. Continue with breadth-first search from the top of the graph as long as there is time left, or until the entire graph has been searched
3. Return the coalition structure that has the highest welfare among those seen so far

Case study: ResQ Freiburg Task Allocation
- N ambulance teams have to rescue M civilians after an earthquake
- Civilians are characterized by Buriedness, Damage and Hit Points
- Costs are the time to rescue a civilian, composed of the coalition's joint max travel time to reach the victim, and the time needed for the rescue
- The overall utility is the number of rescued civilians (the civilians brought to a refuge)
- We considered the ambulance rescue task as super-additive
  - The rescue operation itself is super-additive

Problem as Sequence Assignment
- Assign a sequence R of tasks (here victims) to the grand coalition of agents A (here ambulances)
  - R = <r1, r2, …, rN> where ri denotes a rescue task and i the position in the sequence
  - U(R) denotes the predicted utility (the number of survivors) when executing sequence R
- Hence, the problem is to find the optimal sequence from the set of all possible sequences: $R^* = \operatorname{argmax}_R U(R)$
- Enumerating all possible sequences is intractable (N!)
- Greedy solutions:
  - Prefer victims that can be rescued fast (small ...
  - Prefer urgent victims (high ...
- Are there possibly situations where the ambulance team needs to be split?
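The trade-off between the two greedy rules and the exhaustive optimum can be explored on toy data. The model below is a deliberately simplified assumption, not the ResQ Freiburg predictor: each victim has a fixed rescue time (travel included) and a deadline by which the rescue must be finished, and a victim that can no longer be saved is skipped without cost.

```python
from itertools import permutations
from typing import NamedTuple


class Victim(NamedTuple):
    name: str
    rescue_time: float  # travel plus rescue duration (hypothetical units)
    deadline: float     # latest time at which the rescue may finish


def utility(sequence):
    """U(R): number of victims rescued in time when executing the sequence."""
    clock, survivors = 0.0, 0
    for victim in sequence:
        if clock + victim.rescue_time <= victim.deadline:
            clock += victim.rescue_time
            survivors += 1
        # Otherwise the victim cannot be saved any more and is skipped.
    return survivors


def fastest_first(victims):
    # Greedy rule: prefer victims that can be rescued fast.
    return sorted(victims, key=lambda v: v.rescue_time)


def most_urgent_first(victims):
    # Greedy rule: prefer urgent victims (earliest deadline).
    return sorted(victims, key=lambda v: v.deadline)


def optimal(victims):
    # Exhaustive search over all N! sequences; only viable for tiny N.
    return max(permutations(victims), key=utility)


scenario_1 = [Victim("a", 2, 10), Victim("b", 4, 5), Victim("c", 3, 8)]
scenario_2 = [Victim("x", 5, 5), Victim("y", 2, 6), Victim("z", 2, 6.5)]
results = {
    name: (utility(fastest_first(s)), utility(most_urgent_first(s)), utility(optimal(s)))
    for name, s in (("scenario_1", scenario_1), ("scenario_2", scenario_2))
}
print(results)
```

In the first scenario the fastest-first rule saves 2 victims while urgency-first reaches the optimum of 3. In the second scenario urgency-first saves only 1 while fastest-first reaches the optimum of 2, so neither greedy rule dominates the other.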
Results RoboCup 2004

Summary
- Cooperative sensing with Kalman Filter and Particle Filter
  - Make use of sensor information
- Coordination technique for exploration of the environment
- Dynamic role assignment is an efficient method for team coordination
- Action selection and coordination are essential when acting in groups
- Coalition formation is the process of finding the "social welfare" maximizing coalition structure among a set of agents
