Modeling Disease Transmission Across Social Networks
DIMACS seminar, February 7, 2005
Stephen Eubank
Virginia Bioinformatics Institute, Virginia Tech
eubank@vt.edu
Simulation Science Laboratory

Variations on a Theme
I. Estimating a Social Network
II. Varieties of Social Networks
III. Characterizing Networks for Epidemiology

Translation
• Compute structural properties of very large graphs
  – Which ones?
    • Are local properties enough?
    • Structural properties should be robust
  – How? Need efficient algorithms
• Generate constrained random graphs
  – for experiment
    • Chung-Lu, Reed-Molloy, MCMC
  – for analysis
    • preserve independence as much as possible

If not uniform mixing, what?
• ODE model: homogeneous, isotropic
• Network model: ? ... ~2^(N^2) alternative networks

Do Local Constraints Fix Global Properties?
• N vertices: ~2^(N^2) graphs (non-identical vertices, hence few symmetries)
• E edges: ~N^(2E) graphs
• Degree distribution: ?? graphs
• Clustering coefficient: ?? graphs
• What additional constraints: ?? graphs equivalent w.r.t. epidemics?

Estimating a Social Network
• Synthetic population
• Survey (diary) based activity templates
• Iterative solution to a large game
  – Assigning locations for activities (depends on travel times)
  – Planning routes
  – Estimating travel times (depends on activity locations)

Example Synthetic Household
              Person 1   Person 2   Person 3
Age           26         26         7
Income        $27k       $16k       $0
Status        worker     worker     student
Automobile

Example Route Plans
[Figure: route plans for the first and second person in the household; labeled stops: HOME, WORK, LUNCH, SHOP, DOCTOR]

Estimating Travel Times by Microsimulation
[Figure: road network as a cellular automaton. Labels: 7.5 meter, 1 lane cellular automaton grid cells; single-cell vehicle; multiple-cell vehicle; intersection with multiple turn buffers (not internally divided into grid cells)]

Typical Family's Day
[Figure: activities of each family member along a time axis. Labels: Home, Carpool, Work, Lunch, Shopping, Car, Daycare, Bus, School]

Others Use the Same Locations
[Figure: activities of other people at the same locations along a time axis]

Time Slice of a Social Network

Activities Adapt to Situation
[Figure labels: Home, Home]

Example: Smallpox Response Efficacy
[Figure axis: # deaths per initial infected by day 100]

Part II: Varieties of Social Networks
• Definition of vertex
  – People
  – Concepts (location, role in society, group)
• Definition of edge
  – Effective contact
  – Proximity
• Weights
  – Edges: interaction strength / probability of transmission
  – Vertices: "importance"
• Time dependence
• Directionality

A Social Network: multipartite labeled graph
People (8.8 million)
Vertex attributes:
• age
• household size
• gender
• income
• …

A Social Network: bipartite labeled graph
Locations (1 million)
Vertex attributes:
• (x,y,z)
• land use
• …

A Social Network: bipartite labeled graph
Edge attributes:
• activity type: shop, work, school
• (start time 1, end time 1)
• probability of transmitting

A Social Network: projection onto people

A Social Network: projection onto people
[Figure: one projection per time interval: [t1,t2], [t2,t3], [t3,t4], [t4,t5]]

A Social Network: projection over time

Dendrogram: actual path disease takes

A Social Network: bipartite labeled graph

A Social Network: projection onto locations

A Social Network: projection onto locations
[Figure labels: t2, t3, t4]

A Social Network: projection over time

Disease Dynamics & Scenario Determine Relevant Projections
• People projection: edge if people co-located
  – communicable disease + vaccination/isolation
• Location projection: directed edge if travel between locations
  – contamination, quarantine
• Time dependence: almost periodic
  – Important time scales set by disease dynamics:
    • Infectious period
    • Duration of contact for transmission

Example: Person-person graph

Person-person graph (~ dendrogram with p_transmission = 1)

Dendrogram with p_transmission << 1

Geographic spread

Characterizing EpiSims Networks
• Degree distributions
• Pointwise clustering: ratio of # triangles to # possible
• Assortative mixing by degree, age, …
• Shortest path length distribution
• Expansion

Degree Distribution, location-location

Degree Distribution, people-people

Sensitivity to parameters

Assortative Mixing in EpiSims Graphs
• Static people-people projection is assortative
  – by degree (~0.25)
  – but not as strongly by age, income, household size, …
This is
• Like other social networks
• Unlike
  – technological networks
  – Erdős-Rényi random graphs
  – Barabási-Albert networks

Removing high-degree people: useless

Removing high-degree locations: better

Clustering coefficient vs degree

Characterizing Networks for Epidemiology
• Question: how to change a network to reduce [casualties]?
• Constraints:
  – Don't know ahead of time where outbreak begins
  – Minimize impact on other social functions of network
  – Don't know true network, only estimated one
  – Incorporate dependence on pathogen properties
• Optimization:
  – Propose edge/vertex removal based on measurable (local) properties
  – Quickly estimate effect of new structure
    • How does propagation depend on structure?

Suggested Metric
N_k(i) = number of distinct people connected to person i by a (shortest) path of length k
"k-betweenness", "pointwise k-expansion"
• Important k values are related to ratio of incubation to response times
• Shortest path vs any path: depends on probability of transmission
  – Given N_1(i), ..., N_k(i), can construct analog for non-shortest path of length k
x Assumes static graph, but expect graph to change
• Simple cases incorporate intuitively important properties
  – For k=1, N_1(i) = d(i), the degree of person i
  – For k=2, includes degree distribution, clustering, assortativity by degree

Comparison to "usual suspects"
x Harder to measure in real networks
x Difficult to work with analytically
  – Perturbative expansions (say, around tree-like structure) lack a small parameter to expand in
• Describes how clustering should be combined with degree
• Degree alone determines neither vulnerability nor criticality
• Betweenness is global, sensitive to small changes
• Usual statistics don't incorporate time scales naturally

Degree alone determines neither vulnerability nor criticality
• Same degree distribution
• Different assortative mixing by degree
• Introduce index case uniformly at random: what color (degree) is vulnerable?
• Top graph: degree 1, 80% of the time
• Bottom graph: degree 4, 80% of the time
[Figure label: Critical vertex]

Use depends on how disease is introduced
• Introduction uniformly distributed: consider distribution over all people (mean, variance, …)
• Introduction concentrated on specific part of graph: consider distribution over k-neighborhood
• Introduction by malicious agent: consider worst case or tail

Conclusion
Progress on many fronts, but plenty more to be done:
• Estimating large social networks
• Building efficient, scalable simulations
• Understanding structure of social networks
• Determining how structure affects disease spread
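
Supplementary sketch: computing N_k(i)
The suggested metric N_k(i) from the "Suggested Metric" slide can be illustrated with a breadth-first search on a static, unweighted person-person graph. This is a minimal sketch under stated assumptions, not the EpiSims implementation: the function name shell_sizes, the adjacency-dictionary representation, and the toy graph are all hypothetical choices made for illustration.

```python
from collections import deque


def shell_sizes(adj, i, k_max):
    """Return [N_1(i), ..., N_kmax(i)] for a static, undirected graph.

    N_k(i) counts the distinct vertices whose shortest-path distance
    from vertex i is exactly k. `adj` maps each vertex to an iterable
    of its neighbors (a hypothetical representation for this sketch).
    """
    dist = {i: 0}            # shortest-path distance from i, filled by BFS
    counts = [0] * k_max     # counts[k - 1] holds N_k(i)
    queue = deque([i])
    while queue:
        u = queue.popleft()
        if dist[u] == k_max:
            continue         # do not expand beyond distance k_max
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                counts[dist[v] - 1] += 1
                queue.append(v)
    return counts


# Hypothetical toy person-person graph (not data from the talk).
toy = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b"],
    "d": ["b", "e"],
    "e": ["d"],
}

print(shell_sizes(toy, "a", 3))
```

For k=1 the result is the degree d(i), consistent with the slide. Each call costs one truncated breadth-first search, so evaluating the metric for every person in a graph with millions of vertices would likely need a small k or sampling.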