Based on "Cascading Behavior in Networks: Algorithmic and Economic Issues" in Algorithmic Game Theory (Jon Kleinberg, 2007) and Ch. 16 and 19 of Networks, Crowds, and Markets: Reasoning about a Highly Connected World (David Easley, Jon Kleinberg, 2010)

◦ Motivation
◦ Simple Example
◦ Models
◦ Influence Maximization
◦ Similar Work

What is a network cascade?
◦ A series of correlated behavior changes

Why do we want to study cascading behavior?
◦ Social contexts
◦ Epidemic disease
◦ Viral marketing
◦ Covert organization exposure

What are some of the interesting questions to be raised?
◦ How can we model a cascade?
◦ What can initiate or terminate a cascade?
◦ What are some properties of cascading behavior?
◦ Can we identify subsets of nodes or edges that have greater influence in a cascade than others?

A jar contains either 2 red marbles and 1 blue marble, or 2 blue marbles and 1 red marble (each configuration equally likely a priori)

People come one at a time, draw 1 marble, look at it privately, return it to the jar, and publicly announce which configuration they believe is present (there is an incentive for guessing correctly)

Claim: if the first two guesses match, every later guess repeats them

Recall Bayes' Rule:
$$\Pr[A \mid B] = \frac{\Pr[A] \cdot \Pr[B \mid A]}{\Pr[B]}$$

Suppose the first student draws blue. Their guess is blue:
$$\Pr[\text{majority blue} \mid \text{blue}] = \frac{2}{3}$$

If the second student draws blue, their choice is trivial.
If it is red, then we have:
$$\Pr[\text{majority blue} \mid \text{blue}, \text{red}] = \Pr[\text{majority red} \mid \text{blue}, \text{red}] = \frac{1}{2}$$
and they should announce red, their own draw, to break the tie

Third student: suppose the first two students drew blue and the third draws red; the third still announces blue
◦ $\Pr[\text{majority blue} \mid \text{blue}, \text{blue}, \text{red}] = \frac{2}{3}$

All future students have the same information as the third student, so a cascade occurs

Simple Model
◦ Consider a social network $G = (V, E)$ such that $V$ is the set of individuals and $(u, v) \in E$ if $u$ and $v$ are engaged in an activity or friendship
◦ Now consider a situation in which each node has a choice of behavior, an original behavior A or a new behavior B, with the following incentive on each edge $(v, w)$, parameterized by $q$, $0 \le q \le 1$:
  ◦ If v and w both choose A, they receive a payoff of $q$
  ◦ If v and w both choose B, they receive a payoff of $1 - q$
  ◦ If v and w choose differing behaviors, they receive nothing
◦ Let $d_v$ denote the degree of node v, and let $d_v^A$, $d_v^B$ denote the number of neighbors with behavior A and B, respectively
◦ Aside: a node should adopt behavior A if $d_v^B < q\,d_v$ and B otherwise, since A pays $q\,d_v^A = q\,(d_v - d_v^B)$ in total and B pays $(1 - q)\,d_v^B$

Each node updates its behavior simultaneously
◦ $S$: set of nodes initially adopting behavior B
◦ $h_q(S)$: set of nodes adopting B after 1 update with threshold $q$
◦ $h_q^k(S)$: set of nodes adopting B after k updates with threshold $q$

A node w is converted by a set S if, for some k, $w \in h_q^j(S)$ for all $j \ge k$
A set S is contagious if every node is converted by S
Contagion threshold: the maximum $q$ for which there exists a finite contagious set (also sometimes called cascade capacity)

Example
◦ 2-way infinite path
◦ q = 1/2
◦ S = {0}
[Figure: path nodes −2, −1, 0, 1, 2 shown at t=0, t=1, t=2]

Example
◦ 2-way infinite path
◦ q = 1/2
◦ S = {−1, 0, 1}
[Figure: path nodes −2, −1, 0, 1, 2 shown at t=0, t=1]

The contagion threshold of this graph is 1/2: for any q > 1/2, no finite set can ever spread!
In fact, we can prove that the contagion threshold of any graph is at most 1/2!

Question: what causes cascades to stop?
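The two path examples can be replayed with a short simulation. The sketch below is only an illustration, not code from the source: it assumes a finite window of the path (nodes −5 to 5) as a stand-in for the two-way infinite path, and the helper names `path_graph` and `step` are hypothetical. Each call to `step` applies one synchronous, non-progressive update using the rule that a node adopts B when $d_v^B \ge q\,d_v$.

```python
def path_graph(radius):
    """Adjacency lists for the finite path -radius..radius (an assumed stand-in
    for the two-way infinite path; the two end nodes have a single neighbor)."""
    return {
        v: [w for w in (v - 1, v + 1) if -radius <= w <= radius]
        for v in range(-radius, radius + 1)
    }


def step(adj, adopters, q):
    """One synchronous update: v chooses B iff at least a q fraction of its
    neighbors currently choose B (d_v^B >= q * d_v); otherwise v chooses A."""
    return {
        v
        for v, nbrs in adj.items()
        if sum(w in adopters for w in nbrs) >= q * len(nbrs)
    }


adj = path_graph(5)

# S = {0}, q = 1/2: B moves to the neighbors while node 0 flips back to A
s1 = step(adj, {0}, 0.5)
s2 = step(adj, s1, 0.5)
print(sorted(s1), sorted(s2))

# S = {-1, 0, 1}, q = 1/2: the solid block gains one node on each side
grow = step(adj, {-1, 0, 1}, 0.5)
print(sorted(grow))

# Same seed with q slightly above 1/2: the block shrinks instead of growing
shrink = step(adj, {-1, 0, 1}, 0.6)
print(sorted(shrink))
```

With the single seed, the adopter set becomes {−1, 1} and then {−2, 0, 2}: nodes keep flipping back to A, which is why {0} is not contagious. The three-node seed reaches {−2, ..., 2} after one step, and at q = 0.6 the same seed collapses to {0}.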
Definition: a cluster of density p is a set of nodes such that each node in the set has at least a p fraction of its neighbors in the set
◦ Clusters are what block a cascade (Easley and Kleinberg, Ch. 19): with threshold q, a set of initial adopters fails to convert every node exactly when the remaining network contains a cluster of density greater than 1 − q

Progressive vs. Non-Progressive
◦ Our prior model was non-progressive: nodes could change back and forth between states
◦ A progressive model is also interesting: once a node switches from A to B, it remains B from then on (consider the behavior of pursuing an advanced degree)
◦ Intuition: it is easier to find contagious sets with a progressive model
◦ Actuality: for any graph G, both models have the same contagion threshold

Our model thus far is limited
◦ Threshold is uniform for nodes: everyone is just as predisposed to study algorithms as you are
◦ All neighbors have equal weight: all your Facebook friends are just as important as your immediate family
◦ Undirected graph: the influence you have on your boss is the same as your boss has on you

We will now introduce several models to ameliorate these limitations

Linear threshold model
◦ Goal: allow nodes to weigh the influence of their neighbors differently, and assume that each node's threshold is chosen uniformly at random
  ◦ Non-negative weights $b_{vw}$ representing the influence of w on v, with $\sum_{w \in N(v)} b_{vw} \le 1$
  ◦ Each node has a threshold $\theta_v$ chosen uniformly at random from [0, 1], indicating the weighted fraction of v's neighbors that must adopt the behavior before v does
◦ Definition: a node is activated when it switches from behavior A to B
  ◦ A node becomes active when $\sum_{\text{active } w \in N(v)} b_{vw} \ge \theta_v$
◦ Problem: neighbor influence is strictly additive

General threshold model
◦ Each node v now has a function $g_v$ defined on subsets of $N(v)$
◦ $\forall X \subseteq N(v),\ 0 \le g_v(X) \le 1$
◦ Furthermore, if $X \subseteq Y$ then $g_v(X) \le g_v(Y)$
◦ A node now becomes active when $g_v(X) \ge \theta_v$, where $X = \{\, i \in N(v) \mid i \text{ is active} \,\}$

Cascade model
◦ Idea of 'catching' behavior from your friends
◦ Probabilistic: whenever an edge (u, v) exists such that u is active and v is not, u is given one chance to activate v, with a success probability that depends on u, v, and the set of nodes that have already tried and failed to activate v

Cascade Model, cont.
◦ Replace the function $g_v$ from the General Threshold Model with an incremental function $p_v(u, X)$ that returns the probability that u succeeds in activating v, given the set X of neighbors that have already attempted and failed
◦ Provably equivalent in expressive power to the General Threshold Model

Independent Cascade Model
◦ The incremental function is independent of X and depends only on u and v

Domingos and Richardson: influential work that posed the question: if we can convince a subset of individuals to adopt a new product with the goal of triggering a cascade of future adoptions, who should we target?
◦ NP-hard, even for many simple special cases of the models we've discussed
◦ Can construct instances of those models for which approximation within a factor of $n^{1-\varepsilon}$ (for any $\varepsilon > 0$) is NP-hard
◦ Proof of inapproximability relies on a 'knife-edge' property

Kempe et al.: submodularity of the influence function allows approximation within $1 - \frac{1}{e} - \varepsilon$ (about 63%)
◦ A function f is submodular if for all sets $X \subseteq Y$ and all elements $v \notin Y$, $f(X \cup \{v\}) - f(X) \ge f(Y \cup \{v\}) - f(Y)$

By identifying instances where the influence function f is submodular and monotone, we can make use of the following theorem of Nemhauser, Wolsey, and Fisher:
◦ For a monotone submodular f, the greedy algorithm that builds a k-element set by repeatedly adding the element with the largest marginal gain in f achieves at least a $1 - \frac{1}{e}$ fraction of the best value attainable by any k-element set

Identifying instances in which we have a submodular influence function
◦ Any instance of the Cascade Model in which the incremental functions $p_v$ exhibit diminishing returns has a submodular influence function
◦ Any instance of the Independent Cascade Model has a submodular influence function
◦ Any instance of the General Threshold Model in which all the threshold functions $g_v$ are submodular has a submodular influence function

The anchored k-core problem (Bhawalkar et al.)
◦ Model: each user has a cost for maintaining engagement but derives benefits proportional to the number of engaged neighbors
◦ A k-core is the maximal induced subgraph with minimum degree at least k
◦ Anchored k-core: the maximal induced subgraph in which every unanchored vertex has degree at least k
◦ Corresponds to the problem of preventing cascades of withdrawals
◦ $O(m + n \log n)$ exact solution to the anchored 2-core problem
◦ NP-hard to even approximate within a factor of $O(n^{1-\varepsilon})$ for k ≥ 3

Cascade scheduling (Chierichetti et al.)
◦ Ordering nodes in a cascade to maximize a particular outcome

Identifying failure susceptibility (Blume et al.)
◦ Notion of cascading failure
◦ μ-risk: maximum failure probability of any node in the graph
◦ What about the structure of the underlying graph causes it to have high μ-risk?

1. L. Blume, D. Easley, J. Kleinberg, R. Kleinberg, and É. Tardos. Which Networks are Least Susceptible to Cascading Failures? In Proceedings of the 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science (FOCS '11), pages 393–402. IEEE Computer Society, Washington, DC, USA, 2011.
2. K. Bhawalkar, J. Kleinberg, K. Lewi, T. Roughgarden, and A. Sharma. Preventing Unraveling in Social Networks: The Anchored k-Core Problem. In ICALP '12.
3. F. Chierichetti, J. Kleinberg, and A. Panconesi. How to Schedule a Cascade in an Arbitrary Graph. In Proceedings of EC 2012.
4. P. Domingos and M. Richardson. Mining the Network Value of Customers. In Proc. 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 57–66, 2001.
5. D. Easley and J. Kleinberg. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press, 2010.
6. D. Kempe, J. Kleinberg, and É. Tardos. Maximizing the Spread of Influence in a Social Network. In Proc. 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 137–146, 2003.
7. J. Kleinberg. Cascading Behavior in Networks: Algorithmic and Economic Issues. In Algorithmic Game Theory (N. Nisan, T. Roughgarden, É. Tardos, V. Vazirani, eds.), Cambridge University Press, 2007.