CMU SCS
Large Graph Mining: Power Tools and a Practitioner’s Guide
Task 4: Center-piece Subgraphs
Faloutsos, Miller and Tsourakakis
CMU
KDD'09

Outline
• Introduction – Motivation
• Task 1: Node importance
• Task 2: Community detection
• Task 3: Recommendations
• Task 4: Connection sub-graphs
• Task 5: Mining graphs over time
• …
• Conclusions

Detailed outline
• Problem definition
• Solution
• Results
Reference: H. Tong & C. Faloutsos. Center-piece subgraphs: problem definition and fast solutions. In KDD, 404-413, 2006.

Center-Piece Subgraph (Ceps)
• Given Q query nodes
• Find the center-piece subgraph (within budget b)
• Input of Ceps
  – Q query nodes
  – Budget b
  – k softAnd number
• Applications
  – Social network
  – Law enforcement
  – Gene network
  – …
[Figure: example graph with nodes labeled A, B and C]

Challenges in Ceps
• Q1: How to measure importance?
• A: “proximity” – but how to combine scores?
• (Q2: How to extract connection subgraph?
• Q3: How to do it efficiently?)

AND: Combine Scores
• Q: How to combine scores?
• A: Multiply
• … = prob. that 3 random particles coincide on node j

K_SoftAnd: Relaxation of AND
• What if the AND query has no answer?
  – Noise
  – Disconnected communities
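The AND combination from the "AND: Combine Scores" slide (multiply the per-query proximity scores) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes proximity is measured by a random walk with restart (RWR), and the function names `rwr` and `and_scores`, the restart probability 0.15, the fixed iteration count, and the toy graph are all hypothetical choices made for the example.

```python
def rwr(adj, seed, restart=0.15, iters=200):
    """Random walk with restart from `seed`; returns one score per node.

    `adj` is a square list-of-lists of non-negative edge weights.
    Assumption: every node has at least one outgoing edge.
    """
    n = len(adj)
    out_weight = [sum(row) for row in adj]
    r = [1.0 if i == seed else 0.0 for i in range(n)]
    for _ in range(iters):
        # With probability `restart` the particle jumps back to the seed.
        nxt = [restart if j == seed else 0.0 for j in range(n)]
        # Otherwise it follows an edge chosen in proportion to its weight.
        for i in range(n):
            for j in range(n):
                if adj[i][j]:
                    nxt[j] += (1.0 - restart) * r[i] * adj[i][j] / out_weight[i]
        r = nxt
    return r


def and_scores(adj, query_nodes, restart=0.15):
    """AND score of node j: product of its RWR scores from each query node."""
    scores = [1.0] * len(adj)
    for q in query_nodes:
        per_query = rwr(adj, q, restart)
        scores = [s * x for s, x in zip(scores, per_query)]
    return scores


# Hypothetical toy graph: undirected path 0 - 1 - 2, query nodes 0 and 2.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
scores = and_scores(adj, query_nodes=[0, 2])
best = max(range(len(scores)), key=scores.__getitem__)
print(best, scores)
```

On this toy graph the middle node 1 receives the highest AND score, since it is the only node that is close to both query nodes at once.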
K_SoftAnd: Combine Scores
• Generalization – softAND: we want nodes close to k of the Q (k<Q) query nodes.
• Q: How to do that?
• A: Prob(at least k-out-of-Q will meet each other at j)

AND query vs. K_SoftAnd query
[Figure: per-node scores on an example graph with nodes 1 to 13; panels: And Query, 2_SoftAnd Query]

1_SoftAnd query = OR query
[Figure: per-node scores on an example graph with nodes 1 to 13]

Detailed outline
• Problem definition
• Solution
• Results

Case Study: AND query
[Figure: result subgraph with weighted edges; nodes: H.V. Jagadish, Laks V.S. Lakshmanan, R. Agrawal, Jiawei Han, Heikki Mannila, Christos Faloutsos, Corinna Cortes, Padhraic Smyth, V. Vapnik, Daryl Pregibon, M. Jordan]

2_SoftAnd query
[Figure: result subgraph with weighted edges; group labels: database, Statistic; nodes: H.V. Jagadish, Laks V.S. Lakshmanan, R. Agrawal, Jiawei Han, Umeshwar Dayal, Bernhard Scholkopf, V. Vapnik, Peter L. Bartlett, M. Jordan, Alex J. Smola]

Conclusions
• Proximity (e.g., w/ RWR) helps answer ‘AND’ and ‘k_softAnd’ queries
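The k_softAnd score from the "K_SoftAnd: Combine Scores" slide, Prob(at least k-out-of-Q will meet at j), can be made concrete with a small sketch. It rests on assumptions not stated on the slides: the Q particles are treated as independent, each per-query proximity score at node j is read as the probability that the corresponding particle is at j, and the function name `k_softand_score` and the sample probabilities are hypothetical.

```python
def k_softand_score(probs, k):
    """Probability that at least k of the independent events occur.

    `probs[i]` is the assumed probability that the particle started at
    query node i is found at the node being scored.
    """
    # dist[m] = probability that exactly m of the events seen so far occurred.
    dist = [1.0]
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for m, mass in enumerate(dist):
            nxt[m] += mass * (1.0 - p)   # this particle is elsewhere
            nxt[m + 1] += mass * p       # this particle is at the node
        dist = nxt
    return sum(dist[k:])


# Hypothetical proximity scores of one node with respect to Q = 3 query nodes.
probs = [0.5, 0.2, 0.1]
and_score = k_softand_score(probs, 3)      # k = Q: all particles meet (AND)
soft2_score = k_softand_score(probs, 2)    # 2_SoftAnd
or_score = k_softand_score(probs, 1)       # 1_SoftAnd: at least one (OR)
print(and_score, soft2_score, or_score)
```

Under these assumptions, k = Q reduces to the product of the scores (the AND query) and k = 1 reduces to one minus the probability that no particle is at the node, which matches the slide's remark that a 1_SoftAnd query is an OR query.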