Midge Cozzens
Rutgers University
NIMBioS Graph Workshop, August 16-18, 2010

What is Viral Marketing?
Start off with a small group of individuals and spread by “word-of-mouth”, email, etc.
The initial individual tells two of his friends, and they tell…
Not necessarily selling a product; it can be any attempt to change a behavior or condition, to activate!
Spread of viruses, disruptions, etc. through an ecosystem.

Small-Scale Behavioral Viral Spreading
• Two people start singing
• Next their friends join in
• By the end of the song, the whole bar is singing Mariah Carey’s Fantasy…true story

Graph Theory - Viral Spread
Contagious
Progressive & Nonprogressive
Weighted edges
“Target” node

Game Theory
• Prisoners’ Dilemma Example
• Two choices
  • Confess
  • Don’t Confess
• Payoff
• Prisoners’ Outcomes
• Payoff Table
• Zero Sum and Nonconstant Sum Games
• Nash Equilibrium

Payoff table (entries are years in prison for the row prisoner and the column prisoner, so lower is better):

                 Confess    Don’t Confess
Confess          (10,10)    (0,20)
Don’t Confess    (20,0)     (1,1)

• Apply Game Theory to Viral Marketing
• Players - individuals
• Find Best Strategies for Changing Behaviors

Game Theory and Graph Theory
Graph
Vertices = individuals
Edge (i,j) exists if i and j have a relationship
Weights on edges = strength of relationship
Directed graph
Directed edge, or arc, (k,g) exists if k has influence over g

Varying Applications
Strategies of political parties in getting voters to vote for their candidate. REU student Jim Manning showed that the optimal strategy for the Republicans in 2008 was to go after moderates first and then bring in the conservatives. According to his analysis, had they done that, they would have taken 60% of the vote and beaten Obama. Instead they brought in Sarah Palin to get the conservative base in place.

Another Application
Change the behavior of college students concerning unprotected sex. REU student Jordan Mitchell wrote a paper to be published in a sociology journal showing strategies to convince students to use protection.
She used real data and measured parameters governing use, such as attractiveness, drinking, and multiplicity of partners.

Another Application
Volvo car dealers have decided to concentrate on online marketing to sell cars across the country. Ajay Matapalli, REU student, developed an optimal progressive marketing strategy in which the dealers gained significantly and the incentives were far more lucrative than they would have been for a customer just walking in off the street.

Incentive Model
Individual gain is represented by $v(x_h)$, where $x_h$ represents an individual (in graph terms, a specific node).
The second factor is the gain (or loss) that the individual receives from his or her peers owning the product as well.
The inspiration for separating the individual and collective gain came from H. Peyton Young’s work.
A simple example: if my neighbor had sneakers, then by purchasing sneakers I would gain the enjoyment of being able to run with him.

$u(x_h, x_j)$ is a binary function, where $x_j$ is any node in the graph $G$ other than $x_h$. It equals 1 if there is an edge between $x_h$ and $x_j$ and $x_j$ owns the product; it equals 0 if there is no edge or $x_j$ doesn’t own the product.
$r$ = constant representing the benefit of having a node you are connected to own the product.

Edge weights (the strength of the connection between two people) lie in the interval (0, 1].
Let $w(x_h, x_j)$ be a function, where $x_j$ is any node in the graph $G$ other than $x_h$. If there is an edge between $x_h$ and $x_j$, its value is the weight of that edge; if there is no edge, its value is 0.
This accounts for the fact that not all of your connections to other people are equal; edges that aren’t as strong don’t influence the buyer node as much.
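As a minimal sketch of the functions just defined, the snippet below stores a small weighted graph in a dictionary and evaluates u, w, and the per-peer benefit u·w·r summed over a node's peers. The graph, the set of owners, and the value r = 2.0 are made-up illustration values, not data from the talk.

```python
# Hypothetical example data (not from the talk).
# Each undirected edge is stored once, with a weight in (0, 1].
WEIGHTS = {
    frozenset({"a", "b"}): 0.5,
    frozenset({"a", "c"}): 1.0,
    frozenset({"b", "d"}): 0.25,
}
NODES = {"a", "b", "c", "d"}
OWNERS = {"b", "c"}  # nodes assumed to already own the product
R = 2.0              # assumed value for the benefit constant r


def w(xh, xj):
    """Weight of the edge between xh and xj, or 0 if there is no edge."""
    return WEIGHTS.get(frozenset({xh, xj}), 0.0)


def u(xh, xj):
    """1 if xh and xj share an edge and xj owns the product, else 0."""
    return 1 if w(xh, xj) > 0 and xj in OWNERS else 0


def peer_benefit(xh, nodes=NODES):
    """Sum of u * w * r over every node xj other than xh."""
    return sum(u(xh, xj) * w(xh, xj) * R for xj in nodes if xj != xh)
```

Calling `peer_benefit("a")` adds the contributions of neighbors "b" and "c", both owners, while `peer_benefit("b")` is zero because neither of b's neighbors owns the product. Keeping u and w as separate functions mirrors the slides; in practice the two lookups could be merged.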
Total Effects in the Model
The total effect of viral marketing is the sum of all the benefits from each peer connection:
$$\sum_{j \neq h} u(x_h, x_j)\, w(x_h, x_j)\, r$$
Graph theory comes into play here because, in forming this function, we are concerned with the converted node that is reaching out to the new potential customer nodes (its neighbors).
$q$ = constant that normalizes the weight of the edge for use in the final equation.
$y(x_h, x_o)$ = the weight of the edge coming from the previously converted node $x_o$, if there is one; if there is no previously converted node, then this function equals

The final incentive equation is:
$I(x_h)$ = incentive for individual $x_h$
$$I(x_h) = z(x_h) - v(x_h) - y(x_h, x_o)\, q - \sum_{j \neq h} u(x_h, x_j)\, w(x_h, x_j)\, r$$
The incentive must be at least equal to the gap between the cost of the product to the individual and the gain and ease of buying it.

Overall Gain Model
$G(x_h)$ = overall gain to individual $x_h$
$$G(x_h) = v(x_h) + \sum_{j \neq h} u(x_h, x_j)\, w(x_h, x_j)\, r + y(x_h, x_o)\, q$$

Profit Model
$B$ = cost of marketing
$I$ = incentive
$M$ = cost of manufacture or equivalent
$T$ = retail price
$k$ = number expected to purchase product
$C(x_h)$ = cost to company for $x_h$:
$$C(x_h) = I(x_h) + B/k + M/k$$
$P(x_h)$ = profit for company from $x_h$:
$$P(x_h) = T - C(x_h)$$

Payoff Matrix
Data for table:
Consumer research
Past history data
CDC data for spread of disease; Public and World Health data on the effectiveness of inoculations

Tipping Point and Equilibrium

Graph Theory Questions
Find the k most influential nodes to market to.
In general, identifying the k most influential nodes is NP-hard.
A natural greedy algorithm achieves a (1 − 1/e − ε)-approximation for selecting a target set of size k, using probabilistically determined simulations.

Tuberculosis Network
SW Oklahoma 2002

Spread of Viruses and Inoculation
Social networks relative to virus spreading most often look like stars linked together, as in the next slide, which shows a portion of the first set to contract TB and their contacts.
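On a network of linked stars like this, a greedy selection of the k most influential nodes tends to pick the hubs. The sketch below illustrates the idea under stated assumptions: it assumes an independent-cascade diffusion model (the slides do not name one), a single activation probability `p` for every arc, and a made-up graph `STARS` in which two hubs share one contact. The function names are hypothetical, and the sketch makes no claim about the approximation guarantee.

```python
import random


def simulate_cascade(graph, seeds, p, rng):
    """One independent-cascade run; returns the set of activated nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for node in frontier:
            for nbr in graph.get(node, ()):
                # A newly active node gets one chance to activate each neighbor.
                if nbr not in active and rng.random() < p:
                    active.add(nbr)
                    next_frontier.append(nbr)
        frontier = next_frontier
    return active


def expected_spread(graph, seeds, p, runs, rng):
    """Monte Carlo estimate of the expected number of activated nodes."""
    total = sum(len(simulate_cascade(graph, seeds, p, rng)) for _ in range(runs))
    return total / runs


def greedy_seeds(graph, k, p=0.1, runs=200, seed=0):
    """Pick k seeds, each time adding the node with the largest estimated spread."""
    rng = random.Random(seed)
    chosen = []
    for _ in range(min(k, len(graph))):
        best_node, best_spread = None, -1.0
        for node in sorted(graph):  # sorted for a reproducible tie-break
            if node in chosen:
                continue
            spread = expected_spread(graph, chosen + [node], p, runs, rng)
            if spread > best_spread:
                best_node, best_spread = node, spread
        chosen.append(best_node)
    return chosen


# Hypothetical directed graph: arc (k, g) means k has influence over g.
# Two stars with hubs h1 and h2, linked through the shared contact s.
STARS = {
    "h1": ["a", "b", "c", "s"],
    "h2": ["s", "d", "e"],
    "a": [], "b": [], "c": [], "d": [], "e": [], "s": [],
}
```

With `p=1.0` every arc fires, so `greedy_seeds(STARS, k=2, p=1.0, runs=1)` is deterministic and selects the two hubs. For `p < 1` the estimate is noisy, and `runs` trades accuracy for time.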
Current procedures focus on inoculating the vulnerable, often the very young and the very old. Network analysis tells us that it may be smarter, and more efficient, to focus on the spreaders: those with many contacts to many groups.

Possible Research Directions
Find classes of graphs where finding the k most influential nodes can be done in polynomial time.
Consider ways of subdividing the social network into subgraphs where the k most influential nodes are easy to find.
What type of dominating sets in a graph should one look for?
How does the connectivity of a social network influence spread?
What other connectivity parameters make sense?

Research Directions
Once the most influential node(s) are targeted, how do you reach other (or all) nodes in the graph?
• Find a Hamiltonian path: hard!
• Find an Eulerian path: easy if the graph is connected and has at most 2 vertices of odd degree;
• Partition the graph into connected components with one target node in each component, and add or subtract edges so that an Eulerian path can be found in each.

Find an Euler path
Remove 3 edges
Fleury’s Algorithm to get path

References
Influential Nodes in a Diffusion Model for Social Networks (David Kempe, Jon Kleinberg, Eva Tardos): http://www-rcf.usc.edu/~dkempe/publications/influential-nodes.pdf
Diffusion on Social Networks (Matthew O. Jackson): http://economiepublique.revues.org/docannexe1777.html
The Diffusion of Innovations in Social Networks (H. Peyton Young): http://www.brookings.edu/~/media/Files/rc/reports/1999/05fixtopicname_young/diffusion.pdf
Information Diffusion in Online Social Networks (Lada Adamic): http://www.eecs.harvard.edu/~parkes/nagurney/adamic.pdf
Diffusion in Complex Social Networks (Dunia Lopez-Pintado): http://www.sisl.caltech.edu/pubs/diffusion.pdf
Social Networks and the Diffusion of Economic Behavior (Matthew O. Jackson): http://www.stanford.edu/~jacksonm/yer-netbehavior.pdf
Cascading Behavior in Networks: Algorithmic and Economic Issues (Jon Kleinberg): http://www.cs.cornell.edu/home/kleinber/agtbook-ch24.pdf
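The Eulerian-path condition and Fleury’s Algorithm named under Research Directions can be sketched as follows. This is a minimal illustration, assuming an undirected graph given as an adjacency-list dictionary; the function names and the example graph `TRIANGLE_WITH_TAIL` are hypothetical, and the bridge test is a simple reachability count rather than an efficient method.

```python
def odd_vertices(adj):
    """Vertices of odd degree."""
    return [v for v in adj if len(adj[v]) % 2 == 1]


def reachable(adj, start):
    """Set of vertices reachable from start (depth-first search)."""
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for nbr in adj[v]:
            if nbr not in seen:
                seen.add(nbr)
                stack.append(nbr)
    return seen


def has_euler_path(adj):
    """True if all edges lie in one component and 0 or 2 vertices have odd degree."""
    with_edges = [v for v in adj if adj[v]]
    if not with_edges:
        return True
    connected = reachable(adj, with_edges[0]) >= set(with_edges)
    return connected and len(odd_vertices(adj)) in (0, 2)


def is_bridge(g, a, b):
    """True if removing edge (a, b) cuts off part of what a can reach."""
    before = len(reachable(g, a))
    g[a].remove(b)
    g[b].remove(a)
    after = len(reachable(g, a))
    g[a].append(b)  # restore the edge
    g[b].append(a)
    return after < before


def fleury(adj):
    """Return an Euler path as a list of vertices, or None if none exists."""
    if not has_euler_path(adj):
        return None
    g = {v: list(nbrs) for v, nbrs in adj.items()}  # working copy
    odd = odd_vertices(g)
    start = odd[0] if odd else next((v for v in g if g[v]), None)
    if start is None:
        return []
    path, v = [start], start
    while g[v]:
        for nbr in list(g[v]):
            # Cross a bridge only when no other edge is available.
            if len(g[v]) == 1 or not is_bridge(g, v, nbr):
                break
        g[v].remove(nbr)
        g[nbr].remove(v)
        path.append(nbr)
        v = nbr
    return path


# Hypothetical example: a triangle a-b-c with a pendant vertex d attached to c.
TRIANGLE_WITH_TAIL = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
```

Here `fleury(TRIANGLE_WITH_TAIL)` starts at an odd-degree vertex and uses each of the four edges once. A star with three leaves has four odd-degree vertices, so `fleury` returns `None` for it, which is the case where edges would have to be added or removed first.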