Lectures TDDD10 AI Programming

Lectures
1 AI Programming: Introduction
2 Introduction to RoboRescue
3 Agents and Agents Architecture
4 Multi-Agent and Communication
5 Multi-Agent Decision Making
6 Cooperation And Coordination 1
7 Cooperation And Coordination 2
8 Machine
9 Knowledge Representation
10 Putting It All

TDDD10 AI Programming
Cooperation And Coordination 1
Cyrille Berger

Lecture goals

Lecture content
  Cooperative Sensing & Exploration
    Cooperative State Estimation
    Extracting Predicates
    Robot Exploration
  Coalitions & Roles
    Dynamic Role Assignment
    Coalition Formation

Cooperative Sensing & Exploration

Cooperative State Estimation

Why State Estimation?
Robots need to be aware of their current state in order to perform meaningful actions!
  Where am I located in the world?
  Where are the victims?

Modeling Sensor Noise (1/2)
Sensors are represented by a probabilistic sensor model p(z|x)
Answers the question: What is the probability of measuring z given that I am located in state x?
Example: A laser scanner located 1 m from the wall returns 1.20 m on average every 10th time
Typically represented by a Gaussian
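
Modeling Sensor Noise (2/2)
Data from sensors is noisy, e.g., the distance measurement of a laser at one metre can be 1 m ± 1 cm
Sensor noise is typically modeled by a normal distribution
Fully described by mean μ and variance σ²

Gaussians (1/2)
Univariate: p(x) \sim N(\mu, \sigma^2), with p(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
Multivariate: p(\mathbf{x}) \sim N(\boldsymbol{\mu}, \Sigma), with p(\mathbf{x}) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left(-\tfrac{1}{2} (\mathbf{x}-\boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu})\right)

Gaussians (2/2)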
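
A minimal sketch of a Gaussian sensor model p(z|x) for the laser example. It assumes the "± 1 cm" from the slide is one standard deviation; the function names are illustrative and not part of the lecture material.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of the univariate normal distribution N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def sensor_likelihood(z, true_distance, sigma=0.01):
    """p(z | x): likelihood of reading z when the wall is true_distance metres away.

    sigma = 0.01 m is an assumption (1 cm treated as one standard deviation).
    """
    return gaussian_pdf(z, true_distance, sigma)

near = sensor_likelihood(1.00, 1.0)  # reading equal to the true distance
far = sensor_likelihood(1.03, 1.0)   # reading three standard deviations off
print(near, far)
```

A reading close to the true distance gets a much higher likelihood than one several centimetres off, which is what later lets a filter weight hypotheses against observations.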

State Estimation
State estimation is the process of integrating multiple observations to estimate a state
  i.e. robot location, locations of
Continuous integration of sensor data according to probability distributions
Sensor observations are taken in different coordinate frames, e.g., camera, laser
  Transformation of
Figure panels: 1D, 2D
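
Example of Bayesian State Estimation
Suppose a robot obtains measurement z
What is P(open|z)?

Causal vs Diagnostic Reasoning
P(open|z) is diagnostic
P(z|open) is causal, i.e., the sensor model
Often causal knowledge is easier to obtain
Bayes rule allows us to use causal knowledge:
  P(open|z) = \frac{P(z|open) \, P(open)}{P(z)}

Example
P(z|open) = 0.6, P(z|¬open) = 0.3
P(open) = P(¬open) = 0.5
  P(open|z) = \frac{0.6 \cdot 0.5}{0.6 \cdot 0.5 + 0.3 \cdot 0.5} = \frac{2}{3} \approx 0.67
z raises the probability of the belief that the door is open

General Framework: Recursive Bayesian Filtering
zₜ: Sensor observation at time t
xₜ: State at time t
  P(x_t | z_{1:t}) = \eta \, P(z_t | x_t) \int P(x_t | x_{t-1}) \, P(x_{t-1} | z_{1:t-1}) \, dx_{t-1}
  P(z_t | x_t): Likelihood (sensor model)
  P(x_{t-1} | z_{1:t-1}): Prior
  P(x_t | x_{t-1}): Transition (or motion) model
  Integral over x_{t-1}: Marginalization (compute marginal probability)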
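
The door example can be run recursively, using each posterior as the prior for the next measurement. This is a sketch under two assumptions not stated on the slides: the door does not change state between measurements (no motion model), and the robot receives the same measurement z three times with conditionally independent noise.

```python
def bayes_update(prior_open, p_z_open, p_z_not_open):
    """Posterior P(open | z) from the prior and the causal sensor model."""
    evidence = p_z_open * prior_open + p_z_not_open * (1.0 - prior_open)
    return p_z_open * prior_open / evidence

belief = 0.5           # P(open) = P(not open) = 0.5
history = []
for _ in range(3):     # three identical measurements z (assumption)
    belief = bayes_update(belief, p_z_open=0.6, p_z_not_open=0.3)
    history.append(belief)
print(history)
```

Each measurement raises the belief that the door is open, from 0.5 to 2/3 after the first update and further with each repetition.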

Algorithms for Bayesian Filtering
Kalman Filter: optimal for linear systems and normal distributions, very efficient, uni-modal
Monte Carlo Localization (Particle Filter): good for any distribution, can be computationally expensive, multi-modal

Importance of State Estimation
Triangulation

Kalman Filter vs Simple Averaging
Kalman filtering compared to simple averaging: highly confident estimates are more strongly weighted
Figure panels: Kalman filtering, Simple averaging
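
A small sketch of why confident estimates count more. For a static one-dimensional state, the Kalman update of independent Gaussian estimates reduces to inverse-variance weighting; the observation values below are made up for illustration.

```python
def simple_average(estimates):
    """Mean of the estimate means, ignoring their uncertainty."""
    return sum(mean for mean, _ in estimates) / len(estimates)

def kalman_fuse(estimates):
    """Fuse independent 1D Gaussian estimates (mean, variance) by inverse-variance weighting."""
    weights = [1.0 / var for _, var in estimates]
    fused_var = 1.0 / sum(weights)
    fused_mean = fused_var * sum(w * mean for w, (mean, _) in zip(weights, estimates))
    return fused_mean, fused_var

# Hypothetical ball positions reported by two robots: (mean in m, variance in m^2).
observations = [(2.0, 0.01), (3.0, 1.0)]
avg = simple_average(observations)
fused_mean, fused_var = kalman_fuse(observations)
print(avg, fused_mean, fused_var)
```

The simple average lands halfway between the two reports, while the fused mean stays close to the confident one and the fused variance is smaller than either input variance.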

Monte Carlo Localization (MCL) as Observation Filter
The Kalman filter can only handle a single hypothesis
  However, color thresholding on a soccer field might confuse, for example, "red t-shirts" with the ball
  Consequently, Kalman filtering yields poor results
MCL: Simultaneous tracking of multiple hypotheses
  Can be used to filter out hypotheses weakly supported by observations over time

Markov Localization

Monte Carlo Localization (MCL)
Goal: approach for dealing with arbitrary distributions

Key Idea: Samples
Use multiple samples to represent arbitrary distributions

Particle Set
Set of weighted samples
  x[i]: state
  w[i]: importance weight
The samples represent the posterior

Particles for Approximation
Particles for function approximation
The more particles fall into an interval, the higher its probability density. How to obtain such samples?

Particle Filter
Recursive Bayes filter:
  Non-parametric approach
  Models the distribution by samples
  Prediction: draw from the proposal
  Correction: weighting by the ratio of target and proposal
The more samples we use, the better is the estimate!

Particle Filter Algorithm
1 Sample the particles using the proposal distribution
2 Compute the importance weights
3 Resampling: "Replace unlikely samples by more likely ones"
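
The three steps can be sketched for a robot moving along a line. Everything concrete here is an assumption for illustration: a 1D state, a sensor that measures position directly, Gaussian motion and sensor noise, and plain weighted resampling via `random.choices`.

```python
import math
import random

def particle_filter_step(particles, control, measurement, rng,
                         motion_sigma=0.1, sensor_sigma=0.2):
    # 1. Prediction: sample each particle from the proposal (the motion model).
    predicted = [x + control + rng.gauss(0.0, motion_sigma) for x in particles]
    # 2. Correction: importance weight proportional to the sensor likelihood p(z | x).
    weights = [math.exp(-0.5 * ((measurement - x) / sensor_sigma) ** 2)
               for x in predicted]
    # 3. Resampling: draw particles with probability proportional to their weight.
    return rng.choices(predicted, weights=weights, k=len(predicted))

rng = random.Random(0)
particles = [rng.uniform(0.0, 10.0) for _ in range(500)]  # unknown start position
true_x = 2.0
for _ in range(10):
    true_x += 0.5                                  # robot moves 0.5 m per step
    z = true_x + rng.gauss(0.0, 0.2)               # noisy position measurement
    particles = particle_filter_step(particles, 0.5, z, rng)
estimate = sum(particles) / len(particles)
print(true_x, estimate)
```

Starting from particles spread over the whole corridor, the set collapses around the true position after a few measurements.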

Monte Carlo Localization
Particle Filter for Localization
Each particle is a pose hypothesis
Proposal is the motion model
Correction via the observation model

Resampling
Needed as we have a limited number of samples
Survival of the fittest: "Replace unlikely samples by more likely ones"
"Trick" to avoid that many samples cover unlikely states

Resampling
Roulette wheel
  Binary search
  O(n log n)
Stochastic universal sampling
  Low variance
  O(n)
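
Both resampling schemes can be sketched in a few lines. The function names and the example weights are illustrative assumptions; the roulette wheel draws n independent pointers and locates each by binary search, while stochastic universal sampling uses a single random offset and n equally spaced pointers.

```python
import bisect
import itertools
import random

def roulette_wheel_resample(particles, weights, rng):
    """n independent draws, each located by binary search: O(n log n)."""
    cumulative = list(itertools.accumulate(weights))
    picks = []
    for _ in particles:
        index = bisect.bisect_left(cumulative, rng.uniform(0.0, cumulative[-1]))
        picks.append(particles[min(index, len(particles) - 1)])
    return picks

def low_variance_resample(particles, weights, rng):
    """Stochastic universal sampling: one random offset, n equally spaced pointers, O(n)."""
    n = len(particles)
    step = sum(weights) / n
    pointer = rng.uniform(0.0, step)
    resampled, i, cumulative = [], 0, weights[0]
    for _ in range(n):
        while pointer > cumulative and i < n - 1:
            i += 1
            cumulative += weights[i]
        resampled.append(particles[i])
        pointer += step
    return resampled

rng = random.Random(1)
names, weights = ["a", "b", "c", "d"], [0.1, 0.1, 0.7, 0.1]
wheel = roulette_wheel_resample(names, weights, rng)
systematic = low_variance_resample(names, weights, rng)
print(wheel, systematic)
```

With equally spaced pointers, a particle holding 70% of the weight is copied either two or three times out of four, whereas independent draws can deviate much further from that expected count.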

Phantom Balls: Development of Probability Distribution
Figure panels: first, second, third, fourth, fifth and sixth observation

Extracting Predicates

Case-Study: Extracting predicates for playing soccer
Predicates are needed for symbolic
  Predicates are the basis for action selection and strategic decision making
  Can be considered as world model abstractions
Simple predicates of objects (can be directly computed from positions):
Examples:
  InOpponentsGoal(object): Object in opponent goal?
  InOwnGoal(object): Object in own goal?
  CloseToBorder(object): The distance to any border is below a threshold?
  FrontClear(): Neither another object nor the border is in front?
  InDefense(object): Object in the last third of the soccer field?

Case-Study: Extracting predicates for playing soccer
Extended predicates:
  Computed by normalized grids: (f_i: ℜ×ℜ → [0..1])
  Discretized into cells, e.g., 10x10 cm size
  f_free: indicates positions under the influence of the opponent
  f_covered: indicates positions covered by teammates
  f_desired: indicates tactically good positions
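
A sketch of two simple predicates computed directly from positions. The field dimensions, the coordinate convention (x = 0 at the own goal line), the threshold, and the reading of "last third" as the third nearest the own goal are all assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass
class Field:
    length: float = 10.0  # x runs from 0 (own goal line) to length (opponent goal line)
    width: float = 5.0    # y runs from 0 to width

def close_to_border(position, field, threshold=0.5):
    """True if the distance to the nearest border is below the threshold."""
    x, y = position
    return min(x, field.length - x, y, field.width - y) < threshold

def in_defense(position, field):
    """True if the object lies in the third of the field nearest the own goal (assumed meaning)."""
    return position[0] < field.length / 3.0

field = Field()
ball = (0.2, 2.5)
print(close_to_border(ball, field), in_defense(ball, field))
```

Predicates like these turn continuous positions from the world model into the symbolic facts that action selection works on.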

Robot Exploration
A team of robots has to explore an initially unknown environment by sensor coverage
Find an assignment of target locations to robots that minimizes the overall exploration time
Variants
  Centralized coordination via world model data exchange
  Centralized coordination with assignment optimization
  Decentralized coordination by peer-to-peer communication

Frontier Exploration
Robots fuse and share their local maps
The frontiers between free space and unknown areas are potential target locations
Find a good assignment of frontier locations to robots to minimize overall exploration time

Example: Need for Explicit Coordination

Level of coordination
No exchange of information
Implicit coordination: Sharing a joint map
  Communication and fusion of local maps
  Central mapping system
  Frontier Exploration (Yamauchi et al., 98)
Explicit coordination: Determine better target locations to distribute the robots
  Combinatorial problem: "planner" for robot target assignment

Explicit Coordination
Choose target locations at the frontier to the unexplored area by trading off the expected information gain and travel costs
Reduce utility of target locations whenever they are expected to be covered by the sensors of another robot
Use cooperative sensing, aka distributed state estimation, to compute the joint map

The Coordination Algorithm
1 Determine the set of frontier cells
2 Compute for each robot i the cost Vⁱ(x,y) for reaching each frontier cell <x,y>
3 Set the utility of all frontier cells to 1
4 While there is one robot left without a target point
  1 Determine a robot i and a frontier cell <x,y> which satisfies:
    (i,<x,y>) = argmax{i',<x',y'>} (U(x',y') - Vⁱ'(x',y'))
  2 Reduce the utility of each target point <x',y'> in the visibility area of the selected <x,y> according to:
    U(x',y') ← U(x',y') ⨯ (1 - P(<x,y>,<x',y'>))

Example Revised
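
The coordination algorithm can be sketched as follows. The cost table, the cell coordinates and the visibility function P are hypothetical inputs chosen for the example; in a real system the costs would come from path planning on the joint map.

```python
def assign_targets(costs, visibility):
    """costs[robot][cell] is the travel cost V; visibility(a, b) is P(a, b)."""
    cells = set()
    for per_robot in costs.values():
        cells.update(per_robot)
    utility = {cell: 1.0 for cell in cells}      # step 3: initial utility
    unassigned = set(costs)
    assignment = {}
    while unassigned:                            # step 4
        # Pick the (robot, cell) pair with the best utility minus cost.
        robot, cell = max(
            ((r, c) for r in sorted(unassigned) for c in sorted(costs[r])),
            key=lambda rc: utility[rc[1]] - costs[rc[0]][rc[1]],
        )
        assignment[robot] = cell
        unassigned.remove(robot)
        # Discount cells that the chosen target is expected to cover.
        for other in utility:
            utility[other] *= 1.0 - visibility(cell, other)
    return assignment

def visibility(a, b):
    """Assumed coverage probability: certain for the cell itself, 0.9 for direct neighbours."""
    distance = abs(a[0] - b[0]) + abs(a[1] - b[1])
    if distance == 0:
        return 1.0
    return 0.9 if distance == 1 else 0.0

costs = {
    "r1": {(0, 0): 0.10, (0, 1): 0.20, (5, 5): 0.60},
    "r2": {(0, 0): 0.15, (0, 1): 0.25, (5, 5): 0.50},
}
assignment = assign_targets(costs, visibility)
print(assignment)
```

Both robots are closest to the frontier around (0, 0), but once r1 has claimed it the discounted utility sends r2 to the distant frontier cell instead.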

Typical Trajectories

Exploration Time
Left: implicit coordination
Right: explicit coordination
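
The coordination algorithm commits robot-target pairs one at a time, which is a greedy strategy. The sketch below contrasts a greedy assignment with an exhaustive optimal one on a made-up 2x2 cost matrix. The exhaustive search stands in for a polynomial-time method such as the Hungarian method and is only practical for tiny instances.

```python
from itertools import permutations

def greedy_assignment(cost):
    """Repeatedly commit the cheapest remaining (robot, target) pair."""
    robots = set(range(len(cost)))
    targets = set(range(len(cost[0])))
    result = {}
    while robots:
        r, t = min(((r, t) for r in robots for t in targets),
                   key=lambda rt: cost[rt[0]][rt[1]])
        result[r] = t
        robots.remove(r)
        targets.remove(t)
    return result

def optimal_assignment(cost):
    """Exhaustive search over all one-to-one assignments of a square cost matrix."""
    n = len(cost)
    best = min(permutations(range(n)),
               key=lambda p: sum(cost[r][p[r]] for r in range(n)))
    return dict(enumerate(best))

def total_cost(cost, assignment):
    return sum(cost[r][t] for r, t in assignment.items())

cost = [[1, 2],
        [2, 10]]   # cost[robot][target], hypothetical travel costs
greedy_total = total_cost(cost, greedy_assignment(cost))
optimal_total = total_cost(cost, optimal_assignment(cost))
print(greedy_total, optimal_total)
```

Greedy grabs the cheapest pair first and is then forced into the expensive leftover, while the optimal assignment accepts two medium costs.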

Drawbacks
The assignment considered so far is a greedy assignment
Approaches closer to the optimum:
  Hungarian Method
    Computes the optimal assignment of jobs to machines given a fixed cost matrix
  Market economy-based approaches (Auctions)
    Robots trade with targets
    Computational load is shared between the robots

Coalitions & Roles

Dynamic Role Assignment
A mechanism to efficiently coordinate agents
Predefined roles (e.g. Attacker, Defender)
  Role-specific behaviors selection
Dynamic role assignment
  Assignment: mapping between N roles and M agents
  Can be according to the context (e.g. team formation)
  Suited for dynamic domains (e.g. robot soccer)

Example Robot Soccer
Avoid swarm behavior and interference (e.g. neither attack your own teammates nor get into the way of an attacking or defending robot)
Task decomposition and task (re-)allocation (e.g. the player closest to the ball should go to the ball)
Dynamic role changes (e.g. if a player is blocked, another should take over)
Coordinating joint execution (e.g. passing the ball)

General Algorithm
Assumptions:
  Fixed ordering of roles {1,2,…,N}, e.g. role 1 must be assigned first, followed by role 2, etc.
  Each agent can be assigned to only one role
  The utility u_ij reflects how appropriate agent i is for role j given the context

for all agents in parallel
  I := ∅;                            // Committed assignments with ordering
  for each role j = 1,…,N
    compute utility u_ij;            // Own preference of agent i
    broadcast u_ij;                  // To all other agents
  end;
  wait until all u_ij are received;  // From all the other agents
  for each role j = 1,…,N
    assign role j to agent i* = argmax_{i∉I} {u_ij};
    I := I ∪ {i*};                   // Add assignment
  end;
end.

Case Study: CS-Freiburg Soccer
Each player can have one of four roles:
  goalie (fixed)
    special hardware setup, thus unable to change this role
  active player (in charge of dealing with the ball)
    can approach the ball or bring the ball forward towards the opponent goal
  strategic player:
    maintains a position back in its own half
  supporter: (supports either active or strategic player)
    in defensive play it complements the team's defensive formation
    in offensive play it presents itself to receive a pass close to the opponent's goal
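
The general algorithm (fixed role ordering, broadcast utilities, best uncommitted agent per role) can be sketched for the three field-player roles. The utility table is invented, and the broadcast is replaced by the assumption that every agent already holds the same complete table, so each agent would compute the same result locally.

```python
def assign_roles(utilities, roles):
    """utilities[agent][role]; roles are processed in their fixed priority order."""
    committed = {}                      # the set I of committed agents, with their roles
    for role in roles:
        free_agents = [a for a in sorted(utilities) if a not in committed]
        if not free_agents:
            break
        best = max(free_agents, key=lambda a: utilities[a][role])
        committed[best] = role
    return committed

# Hypothetical utilities u_ij in [0, 1] for three field players.
utilities = {
    "r1": {"active": 0.9, "strategic": 0.8, "support": 0.3},
    "r2": {"active": 0.7, "strategic": 0.2, "support": 0.6},
    "r3": {"active": 0.4, "strategic": 0.5, "support": 0.5},
}
roles = ["active", "strategic", "support"]   # assumed priority order
assignment = assign_roles(utilities, roles)
print(assignment)
```

Because roles are filled in priority order, the most important role always gets the agent best suited for it, even if that leaves a weaker fit for later roles.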

Role Utilities
Placement: each role has a preferred location, which depends on the situation:
  ball position, position of teammates and opponents
  defensive situation or attack
  computed by potential fields
Utility u_ij for each role:
  "Negative utility (costs)" for reaching the preferred location of the role
  Costs are computed from partial costs for distance (u_d), turn angle (u_t), objects on the path (u_o)
  Weighted sum to ensure utilities between 0..1: u_ij = w_d * u_d + w_t * u_t + w_o * u_o

Dynamic Role Assignment
Figure labels: active role, strategic role, support role
Each player computes utilities u_ij and broadcasts results
Group utility:
  Consider all possible assignments and compute the summed utility from each agent's individual utility for its assigned role
  Take the assignment with the highest utility sum as solution (under the assumption that every agent does so)
Roles are re-assigned only when
  the role change is significant, i.e. the new utility >> old utility (hysteresis factor to avoid oscillation)
  two players agree (by
Note that agents might "lie" since their "opinion" about global positions can differ (even with a global world model)
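
A sketch of the group-utility variant with hysteresis. The utility table is invented, and modelling "new utility >> old utility" as an additive margin is an assumption; the slides only state that a hysteresis factor is used.

```python
from itertools import permutations

def best_group_assignment(utilities, roles):
    """Try every assignment of distinct roles to agents and keep the highest utility sum."""
    agents = sorted(utilities)

    def group_utility(perm):
        return sum(utilities[a][r] for a, r in zip(agents, perm))

    best = max(permutations(roles, len(agents)), key=group_utility)
    return dict(zip(agents, best)), group_utility(best)

def reassign(current, utilities, roles, hysteresis):
    """Switch roles only if the group utility improves by more than the margin."""
    candidate, new_value = best_group_assignment(utilities, roles)
    old_value = sum(utilities[a][r] for a, r in current.items())
    return candidate if new_value > old_value + hysteresis else current

utilities = {
    "r1": {"active": 0.9, "strategic": 0.8, "support": 0.3},
    "r2": {"active": 0.8, "strategic": 0.2, "support": 0.6},
    "r3": {"active": 0.4, "strategic": 0.5, "support": 0.5},
}
roles = ["active", "strategic", "support"]
current = {"r1": "active", "r2": "support", "r3": "strategic"}  # utility sum 2.0
best, best_value = best_group_assignment(utilities, roles)       # utility sum 2.1
kept = reassign(current, utilities, roles, hysteresis=0.2)
switched = reassign(current, utilities, roles, hysteresis=0.05)
print(best, kept, switched)
```

With a large margin the small improvement is ignored and the team keeps its roles; with a small margin the roles switch. The margin is what prevents two players from swapping back and forth when their utilities are nearly equal.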

Example for Role Switching (1/2)
Attack against Team Osaka (Japan). The attacking robot is blocked by a defender and consequently replaced by an unblocked player

Example for Role Switching (2/2)
Defense against Artisti Veneti (Italy). The roles active and strategic player are switched a couple of times

Failed ball-passing
A pass in the semi-final against the Italian ART Italy team (RoboCup 1999). This was based on a standard plan: "if it is not possible to score directly, wait until the supporter arrives, then make the pass"

Coalition Formation

Coalition Formation
Necessary when tasks are more efficiently solved by a specific combination of agent capabilities
  E.g. a disaster location requires ambulance and fire brigade
Assignment of groups to tasks is necessary when tasks cannot be performed by a single agent
  E.g. a single fire brigade cannot extinguish a large fire
A group of agents is called a coalition
A coalition structure is a partitioning of the set of agents into disjoint coalitions
An agent participates in only one coalition
A coalition may consist of only a single agent
Generally, coalitions consist of heterogeneous agents

Applications for Coalition Formation
In e-commerce, buyers can form coalitions to purchase a product in bulk and take advantage of price discounts (Tsvetovat et al., 2000)
In Real Time Strategy (RTS) games, groups of heterogeneous agents can jointly attack bases of the opponent. The mixture of agents has to match the defense strategy of the opponent
Distributed vehicle routing among delivery companies with their own delivery tasks and vehicles (Sandholm 1997)
Wide-area surveillance by autonomous sensor networks (Dang 2006)
In Rescue, team formation to solve particular sub-problems, e.g. larger robots deploy smaller robots into confined spaces

Fire Brigade Example

Problem Formulation
A group of agents S ⊆ A is called a coalition, where A denotes the set of all agents and S ≠ ∅
  The coalition of all the agents is called the grand coalition
A coalition structure (CS) partitions the set of agents into coalitions
The value of each coalition S is given by a function v_S

Three Activities in Coalition Formation
Coalition structure generation:
  Partitioning of the agents into exhaustive and disjoint coalitions
  Inside the coalitions, agents will coordinate their activities, but agents will not coordinate between coalitions
Solving the optimization problem in each coalition:
  Pooling the tasks and resources of the agents in the coalition and solving the joint problem
  The coalition objective could be to maximize the monetary value, or the overall expected utility
Dividing the value of the generated solution:
  In the end, each agent will receive a value (money or utility) as a result of participating in the coalition
  In some problems, the coalition value the agents have to share is negative, being a shared cost

Coalition structure generation
The value of a coalition structure is given by:
  V(CS) = \sum_{S \in CS} v_S
  Each coalition value is independent of non-members' actions
The goal is to maximize the social welfare of a set of agents A by finding a coalition structure that satisfies:
  CS^* = \arg\max_{CS \in \mathrm{Partitions}(A)} V(CS)
CS* is the social welfare maximizing coalition structure

Special Coalition Values
Coalition values are super-additive iff for every pair of disjoint coalitions S, T ⊆ A: v_{S \cup T} \geq v_S + v_T
  If coalition values are super-additive, then the coalition structure containing the grand coalition gives the highest value
  Agents cannot do worse by teaming up
Coalition values are sub-additive iff for every pair of disjoint coalitions S, T ⊆ A: v_{S \cup T} < v_S + v_T
  If coalition values are sub-additive, then the coalition structure {{a} | a ∈ A}, in which no agent cooperates, gives the highest value
Is the ambulance rescue task in the RoboCup Rescue domain super-additive, sub-additive, or neither?

Coalition structure generation
Input: all possible coalitions and their values
For N agents the number of possible coalitions is 2^N - 1, and the number of possible coalition structures (the Bell number B_N) grows faster than N^(N/2)

Coalition graph
A = {1,2,3,4}
Nodes represent coalition structures
Arcs represent either merges (downwards) or splits (upwards)

Coalition Structure Search Time
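
For small agent sets, the coalition structures can be enumerated by brute force. The sketch below generates every partition of A = {1,2,3,4} and picks the one with the highest summed value. The value function, which depends only on coalition size, is invented for the example.

```python
def partitions(agents):
    """Yield every coalition structure (set partition) of the list of agents."""
    if not agents:
        yield []
        return
    first, rest = agents[0], agents[1:]
    for smaller in partitions(rest):
        # Put `first` into each existing coalition in turn ...
        for i, coalition in enumerate(smaller):
            yield smaller[:i] + [[first] + coalition] + smaller[i + 1:]
        # ... or let it form a coalition of its own.
        yield [[first]] + smaller

def structure_value(structure, v):
    """V(CS): sum of the coalition values v_S over all coalitions in the structure."""
    return sum(v(frozenset(coalition)) for coalition in structure)

def best_structure(agents, v):
    return max(partitions(agents), key=lambda cs: structure_value(cs, v))

def value_by_size(coalition):
    """Hypothetical v_S: pairs work well, larger groups add nothing."""
    return {1: 1, 2: 3, 3: 3, 4: 3}[len(coalition)]

agents = [1, 2, 3, 4]
all_structures = list(partitions(agents))
best = best_structure(agents, value_by_size)
print(len(all_structures), best, structure_value(best, value_by_size))
```

Four agents give 15 coalition structures and five agents already give 52, which is why exhaustive search stops being an option as the team grows.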

Approximate Solution to Structure Search
To search the whole coalition graph for the optimal coalition structure is intractable (only feasible if |A| < 15)
Can we approximate the search by visiting only a subset L of the nodes?
Choose a set L (a subset of all coalition structures of A) and pick the best coalition structure seen:
  CS_L^* = \arg\max_{CS \in L} V(CS)
One requirement is to guarantee that the found coalition structure is within a worst-case bound k from optimal:
  k \cdot V(CS_L^*) \geq V(CS^*)

Approximate Solution
If the bottom two levels of the graph are considered, then k = |A| and the number of nodes searched is n = 2^(|A|-1)
It can be proven that no other search algorithm can establish a better bound k while searching n = 2^(|A|-1) or fewer nodes

Coalition Structure Search Algorithm
1 Search the bottom two levels of the coalition structure graph
2 Continue with breadth-first search from the top of the graph as long as there is time left, or until the entire graph has been searched
3 Return the coalition structure that has the highest welfare among those seen so far
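
Step 1 can be sketched by generating only the bottom two levels of the graph, that is, the grand coalition and every split of A into two coalitions. The constant value function is an invented worst-case-style example in which the true optimum (all agents alone) has value |A|.

```python
from itertools import combinations

def bottom_two_levels(agents):
    """Yield the coalition structures with one or two coalitions."""
    agents = list(agents)
    everyone = frozenset(agents)
    yield [everyone]                           # grand coalition
    first, rest = agents[0], agents[1:]
    # Keep `first` in the left coalition so each two-way split appears once.
    for size in range(len(rest)):
        for extra in combinations(rest, size):
            left = frozenset((first,) + extra)
            yield [left, everyone - left]

def structure_value(structure, v):
    return sum(v(coalition) for coalition in structure)

def constant_value(coalition):
    """Hypothetical v_S = 1 for every coalition, so the optimum is all singletons."""
    return 1

agents = [1, 2, 3, 4]
searched = list(bottom_two_levels(agents))
best_seen = max(searched, key=lambda cs: structure_value(cs, constant_value))
best_value = structure_value(best_seen, constant_value)
optimum = len(agents)   # value of the all-singletons structure under constant_value
print(len(searched), best_value, optimum)
```

Only 2^(|A|-1) = 8 of the 15 structures are visited, and the best one seen is within the factor k = |A| of the optimum, as the bound requires.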

Case study: ResQ Freiburg Task Allocation
N ambulance teams have to rescue M civilians after an earthquake
  Civilians are characterized by Buriedness, Damage and Hit Points
Costs are the time to rescue a civilian, composed of the coalition's joint max travel time to reach the victim, and the time needed for the rescue
The overall utility is the number of rescued civilians (the civilians brought to a refuge)
We considered the ambulance rescue task as super-additive
  The rescue operation itself is super-additive

Problem as Sequence Assignment
Assign a sequence R of tasks (here victims) to the grand coalition of agents A (here ambulances)
  R = <r1,r2,…,rN> where ri denotes a rescue task and i the position in the sequence
U(R) denotes the predicted utility (the number of survivors) when executing sequence R
Hence, the problem is to find the optimal sequence from the set of all possible sequences: R* = argmax_R U(R)
Enumerating all possible sequences is intractable (N!)
Greedy solutions:
  Prefer victims that can be rescued fast (small
  Prefer urgent victims (high
Are there possibly situations where the ambulance team needs to be split?

Results RoboCup 2004

Summary
Cooperative sensing with Kalman Filter and Particle Filter
  Make use of sensor information
Coordination technique for exploration of the environment
Dynamic role assignment is an efficient method for team coordination
Action selection and coordination are essential when acting in groups
Coalition formation is the process of finding the "social welfare" maximizing coalition structure among a set of agents