Submodularity for Distributed Sensing Problems
Zeyn Saigol
IR Lab, School of Computer Science, University of Birmingham
6th July 2010

Outline
1. What is submodularity?
2. Non-myopic maximisation of information gain
3. Path planning
4. Robust maximisation and minimisation
5. Summary
* These slides are borrowed from Andreas Krause and Carlos Guestrin (see http://www.submodularity.org/)

Set functions
Submodularity in AI has been popularised by Andreas Krause and Carlos Guestrin.
- Finite set $V = \{1, 2, \ldots, n\}$
- Function $F: 2^V \to \mathbb{R}$
- Example: $F(A) = IG(X_A; Y) = H(Y) - H(Y \mid X_A) = \sum_{y, x_A} P(x_A, y)\,[\log P(y \mid x_A) - \log P(y)]$
[Figure: naive Bayes network with parent Y "Sick" and children X1 "Fever", X2 "Rash", X3 "Male"; e.g. F({X1, X2}) = 0.9, F({X2, X3}) = 0.5]

Submodular set functions
A set function F on V is called submodular if for all $A, B \subseteq V$:
$F(A) + F(B) \geq F(A \cup B) + F(A \cap B)$
Equivalent diminishing-returns characterisation: for $A \subseteq B$ and $s \notin B$,
$F(A \cup \{s\}) - F(A) \geq F(B \cup \{s\}) - F(B)$
i.e. adding s to the small set A gives a large improvement, while adding s to the superset B gives only a small improvement.

Example problem: sensor coverage
- Place sensors in a building; each node predicts the values of positions within some radius.
- Possible locations V; for $A \subseteq V$, F(A) = "area covered by sensors placed at A".
- Formally: W is a finite set with a collection of n subsets $S_i \subseteq W$; for $A \subseteq V = \{1, \ldots, n\}$ define $F(A) = |\bigcup_{i \in A} S_i|$.

Set coverage is submodular
[Figure: adding a new sensor region S' to A = {S1, S2} covers more new area than adding it to B = {S1, S2, S3, S4}, illustrating $F(A \cup \{S'\}) - F(A) \geq F(B \cup \{S'\}) - F(B)$]

Approximate maximization
Given a finite set V and a monotonic submodular function F(A), we want
$A^* = \mathrm{argmax}_{|A| \leq k} F(A)$
This is NP-hard! Greedy algorithm:
- Start with $A_0 = \{\}$
- For i = 1 to k:
  $s_i := \mathrm{argmax}_s\, F(A_{i-1} \cup \{s\}) - F(A_{i-1})$
  $A_i := A_{i-1} \cup \{s_i\}$

Guarantee on greedy algorithm
Theorem [Nemhauser et al. '78]: given a monotonic submodular function F with $F(\emptyset) = 0$, the greedy maximization algorithm returns $A_{greedy}$ with
$F(A_{greedy}) \geq (1 - 1/e) \max_{|A| \leq k} F(A)$
i.e. roughly 63% of optimal.
Sidenote: the greedy algorithm gives a 1/2 approximation for maximization over any matroid C! [Fisher et al. '78]
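To make the greedy algorithm and its guarantee concrete, here is a minimal Python sketch (not from the talk) run on the sensor-coverage function $F(A) = |\bigcup_{i \in A} S_i|$ defined above. The five-sensor instance is hypothetical, invented purely for illustration.

```python
# Minimal sketch of the greedy algorithm above, applied to the monotone
# submodular coverage function F(A) = |union_{i in A} S_i|.
# The 5-sensor instance is hypothetical.

def coverage(regions, A):
    """F(A) = number of cells covered by the sensors in A."""
    covered = set()
    for i in A:
        covered |= regions[i]
    return len(covered)

def greedy_max(regions, k):
    """Pick k sensors greedily; by Nemhauser et al. '78 this achieves
    F(A_greedy) >= (1 - 1/e) * max_{|A| <= k} F(A)."""
    A = []
    for _ in range(k):
        gain = lambda s: coverage(regions, A + [s]) - coverage(regions, A)
        A.append(max((s for s in regions if s not in A), key=gain))
    return A

regions = {                      # hypothetical cells covered by each sensor
    "s1": {1, 2, 3, 4},
    "s2": {3, 4, 5, 6},
    "s3": {6, 7},
    "s4": {8, 9},
    "s5": {1, 9},
}
A = greedy_max(regions, k=2)
print(A, coverage(regions, A))   # ['s1', 's2'] covering 6 cells
```

In practice one would cache marginal gains ("lazy greedy"), which submodularity makes safe, but the naive version above already attains the (1 - 1/e) guarantee.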
Example: submodularity of information gain
- $Y_1, \ldots, Y_m, X_1, \ldots, X_n$ discrete random variables
- $F(A) = IG(Y; X_A) = H(Y) - H(Y \mid X_A)$
- F(A) is always monotonic, but NOT always submodular.
Theorem [Krause & Guestrin, UAI '05]: if the $X_i$ are all conditionally independent given Y, then F(A) is submodular!
[Figure: graphical model with parents Y1, Y2, Y3 and children X1, ..., X4]
Hence the greedy algorithm works! In fact, NO practical algorithm can do better than the (1 - 1/e) approximation.

Information gain with cost
- Instead of each sensor having the same measurement cost, each node X has a variable cost C(X).
- Aim: $\max F(A)$ s.t. $C(A) \leq B$, where $C(A) = \sum_{X \in A} C(X)$.
- Construct every possible 3-element subset of V and run the greedy algorithm starting from each, where greedy now selects additional nodes X by maximising the benefit-cost ratio $[F(A \cup \{X\}) - F(A)] / C(X)$.
- Finally choose the best resulting set A.
- This maintains the (1 - 1/e) approximation guarantee.

Path planning
So far: $\max_A F(A)$ or $\max_A \min_i F_i(A)$ subject to $|A| \leq k$. In practice there are more complex constraints:
- Locations need to be connected by paths [Chekuri & Pal, FOCS '05] [Singh et al., IJCAI '07]
- Sensors need to communicate (form a routing tree) [Krause et al., IPSN '06]
[Figures: lake monitoring; building monitoring]

Informative path planning
So far: max F(A) s.t. |A| <= k. But the most informative locations might be far apart, and the robot needs to travel between the selected locations!
[Figure: weighted graph over locations s1, ..., s11]
- Locations V are nodes in a graph.
- C(A) = cost of the cheapest path connecting the nodes A.
- $\max F(A)$ s.t. $C(A) \leq B$.

The pSPIEL algorithm [Krause, Guestrin, Gupta, Kleinberg, IPSN '06]
pSPIEL (padded Sensor Placements at Informative and cost-Effective Locations) is an efficient non-myopic algorithm:
1. Select starting and ending locations s1 and sB.
2. Decompose the sensing region into small, well-separated clusters.
3. Solve the cardinality-constrained problem per cluster (greedily).
4. Combine the solutions using an orienteering algorithm.
5. Smooth the resulting path.
[Figure: clusters C1, ..., C4 of candidate locations, stitched into a path from s1 to sB]

Limitations of pSPIEL
- Requires a locality property: far-apart observations are independent.
Adaptive algorithm [Singh, Krause, Kaiser, IJCAI '09]:
- Simply re-plans on every timestep.
- Often this is near-optimal; if it isn't, an adaptivity-gap term has to be added to the objective function U to encourage exploration.

Submodular optimisation spectrum
Maximization: $A^* = \mathrm{argmax}_A F(A)$
- Sensor placement
- Informative path planning
- Active learning
- ...
Optimise for the worst case: $A^* = \mathrm{argmax}_A \min_i F_i(A)$
Minimization: $A^* = \mathrm{argmin}_A F(A)$
- Structure learning ($A^* = \mathrm{argmin}_A I(X_A; X_{V \setminus A})$)
- Clustering
- MAP inference in Markov Random Fields

Summary
Submodularity is useful!
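As a postscript, the budgeted path-planning objective max F(A) s.t. C(A) <= B can be made concrete with a toy baseline. The sketch below is NOT pSPIEL (no clustering, no orienteering step): it is a naive cost-benefit greedy walk that repeatedly moves to the node with the best marginal gain per unit of extra travel cost. The graph, the objective F, and all names are hypothetical.

```python
# Naive cost-benefit baseline (NOT pSPIEL) for max F(A) s.t. C(A) <= B:
# greedily walk to the unvisited node with the best marginal gain per
# unit of extra travel cost. Graph, objective, and instance are hypothetical.
import heapq

def shortest_paths(graph, src):
    """Dijkstra over a dict-of-dicts adjacency map with edge weights."""
    dist = {src: 0.0}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue
        for v, w in graph[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return dist

def greedy_path(graph, F, start, budget):
    """Extend the walk while budget remains; 'spent' tracks distance
    walked, an upper bound on the cheapest path connecting A."""
    A, cur, spent = [start], start, 0.0
    while True:
        dist = shortest_paths(graph, cur)
        best, best_ratio = None, 0.0
        for v in graph:
            if v in A or v not in dist or spent + dist[v] > budget:
                continue
            ratio = (F(A + [v]) - F(A)) / dist[v]
            if ratio > best_ratio:
                best, best_ratio = v, ratio
        if best is None:
            return A
        spent += dist[best]
        A.append(best)
        cur = best

# Hypothetical instance: small weighted graph, toy coverage objective.
graph = {"s1": {"s2": 1, "s3": 1}, "s2": {"s1": 1, "s4": 2},
         "s3": {"s1": 1, "s5": 1}, "s4": {"s2": 2}, "s5": {"s3": 1}}
regions = {"s1": {1}, "s2": {1, 2}, "s3": {3}, "s4": {4, 5}, "s5": {3, 6}}

def F(A):  # toy objective: number of distinct regions covered by A
    return len(set().union(*(regions[v] for v in A)))

print(greedy_path(graph, F, "s1", budget=3))  # ['s1', 's2', 's4']
```

A greedy walk like this carries no approximation guarantee in general; obtaining one under path constraints is precisely what the clustering-plus-orienteering machinery of pSPIEL is for.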