Models of the Spread of Disease through Social Networks

Challenges for Discrete Mathematics
and Theoretical Computer Science
in Homeland Security
1
Great concern about the deliberate introduction of
diseases by bioterrorists has led to new challenges
for mathematical scientists.
smallpox
2
Bioterrorism issues are typical of many homeland
security issues.
This talk will emphasize bioterrorism, but many of
the “messages” apply to homeland security in
general.
Photo caption: Waiting on line to get smallpox vaccine during New York City smallpox epidemic
3
Outline
1. The role of mathematical sciences in the fight for
homeland security and against bioterrorism.
2. Methods of computational and mathematical
epidemiology
2a. Other areas of mathematical sciences
2b. Discrete math and theoretical CS
3. Graph-theoretical models of spread and control
of disease
4
Dealing with bioterrorism requires detailed
planning of preventive measures and responses.
Both require precise reasoning and extensive
analysis.
5
Understanding infectious systems requires being
able to reason about highly complex biological
systems, with hundreds of demographic and
epidemiological variables.
Intuition alone is insufficient to fully understand
the dynamics of such systems.
6
Experimentation or field trials are often
prohibitively expensive or unethical and do not
always lead to fundamental understanding.
Therefore, mathematical modeling becomes an
important experimental and analytical tool.
7
Mathematical models have become important tools
in analyzing the spread and control of infectious
diseases and plans for defense against bioterrorist
attacks, especially when combined with powerful,
modern computer methods for analyzing and/or
simulating the models.
8
What Can Math Models Do For Us?
•Sharpen our understanding of fundamental
processes
•Compare alternative policies and interventions
•Help make decisions.
•Prepare responses to bioterrorist attacks.
•Provide a guide for training exercises and
scenario development.
•Guide risk assessment.
•Predict future trends.
10
What are the challenges for mathematical scientists
in the defense against disease?
This question led DIMACS, the Center for Discrete
Mathematics and Theoretical Computer Science, to
launch a “special focus” on this topic.
Post-September 11 events soon led to an emphasis
on bioterrorism.
11
DIMACS Special Focus on
Computational and Mathematical
Epidemiology 2002-2005
Anthrax
12
Methods of Math. and Comp. Epi.
Math. models of infectious diseases go back to
Daniel Bernoulli’s mathematical analysis of
smallpox in 1760.
13
Hundreds of math. models since have:
•highlighted concepts like core population in
STDs;
14
•made explicit concepts such as herd immunity
for vaccination policies;
15
•led to insights about drug resistance, rate of
spread of infection, epidemic trends, effects of
different kinds of treatments.
16
The size and overwhelming complexity of modern
epidemiological problems -- and in particular the
defense against bioterrorism -- call for new
approaches and tools.
17
The Methods of Mathematical
and Computational Epidemiology
•Statistical Methods
–long history in epidemiology
–changing due to large data sets involved
•Dynamical Systems
–model host-pathogen systems, disease spread
–difference and differential equations
–little systematic use of today’s powerful computational
methods
18
The Methods of Mathematical
and Computational Epidemiology
•Probabilistic Methods
–stochastic processes, random walks, percolation,
Markov chain Monte Carlo methods
–simulation
–need to bring in more powerful computational tools
19
Discrete Math. and Theoretical
Computer Science
• Many fields of science, in particular molecular
biology, have made extensive use of DM broadly
defined.
20
Discrete Math. and Theoretical
Computer Science Cont’d
•Especially useful have been those tools that make
use of the algorithms, models, and concepts of
TCS.
•These tools remain largely unused and unknown
in epidemiology and even mathematical
epidemiology.
21
DM and TCS Continued
•These tools are made especially relevant to
epidemiology because of:
–Geographic Information Systems
22
DM and TCS Continued
–Availability of large and disparate
computerized databases on subjects relating to
disease and the relevance of modern methods
of data mining:
–Issues involve
•detection
•surveillance (monitoring)
•streaming data analysis
•clustering
•visualization of data
24
DM and TCS Continued
–The increasing importance of an evolutionary
point of view in epidemiology and the relevance
of DM/TCS methods of phylogenetic tree
reconstruction.
•Heavy use of DM in phylogenetic tree
reconstruction
•Might help in identification of source of an
infectious agent
26
A Sampling of What is
Happening at DIMACS
“Working Group” on Mathematical Sciences
Challenges in Defense Against Bioterrorism
Working Group on Disease Surveillance and
Detection
Working Group on Vaccination Strategies
Computer Security: W.G. on Analogies between
Computer Viruses and Biological Viruses
27
A Sampling of What is
Happening at DIMACS
Research Project on Monitoring Message Streams
Research Project on Sharing Information between
Databases
Special Focus on Communications Security
Special Focus on Computational and Mathematical
Epidemiology
28
Models of the Spread and Control of
Disease through Social Networks
•Diseases are spread through social networks.
•This is especially relevant to sexually transmitted
diseases such as AIDS.
•“Contact tracing” is an important part of any
strategy to combat outbreaks of diseases such as
smallpox, whether naturally occurring or resulting
from bioterrorist attacks.
29
The Basic Model
Social Network = Graph
Vertices = People
Edges = contact
State of a Vertex:
simplest model: 1 if infected, 0 if not infected
(SI Model)
More complex models: SI, SEI, SEIR, etc.
S = susceptible, E = exposed, I = infected, R =
recovered (or removed)
30
More About States
Once you are infected, can you be cured?
If you are cured, do you become immune or can
you re-enter the infected state?
We can build a digraph reflecting the possible ways
to move from state to state in the model.
31
The State Diagram for a Smallpox
Model
The following diagram is from a Kaplan-Craft-Wein (2002) model for comparing alternative
responses to a smallpox attack. This is being
considered by the CDC and Office of Homeland
Security.
32
33
The Stages
Row 1: “Untraced” and in various stages of
susceptibility or infectiousness.
Row 2: Traced and in various stages of the queue
for vaccination.
Row 3: Unsuccessfully vaccinated and in various
stages of infectiousness.
Row 4: Successfully vaccinated; dead
34
Moving From State to State
Let s_i(t) give the state of vertex i at time t.
Two states 0 and 1.
Times are discrete: t = 0, 1, 2, …
35
Majority Processes
Basic Majority Process: You change your state at
time t+1 if a majority of your neighbors have the
opposite state at time t.
(No change in case of “ties”)
Useful in models of spread of opinion.
Disease interpretation? Cure if majority of your
neighbors are uninfected. Does this make sense?
36
Majority Processes II
Irreversible Majority Process: You change your
state from 0 to 1 at time t+1 if a majority of
your neighbors have state 1 at time t. You never
leave state 1.
(No change in case of “ties”)
Disease interpretation? Infected if sufficiently
many of your neighbors are infected.
37
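A minimal Python sketch of one synchronous step of the two majority processes described above. The graph encoding (a dict mapping each vertex to its list of neighbors), the name `majority_step`, and the three-vertex path are illustrative assumptions, not part of the original slides.

```python
def majority_step(graph, state, irreversible=False):
    """One synchronous step of the basic or irreversible majority process."""
    new_state = {}
    for v, nbrs in graph.items():
        opposite = sum(1 for u in nbrs if state[u] != state[v])
        # Strict majority of the neighbors is required; a tie causes no change.
        flips = 2 * opposite > len(nbrs)
        if irreversible and state[v] == 1:
            flips = False  # in the irreversible process, state 1 is never left
        new_state[v] = 1 - state[v] if flips else state[v]
    return new_state


# Hypothetical example: path a - b - c with only b infected at time 0.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
start = {"a": 0, "b": 1, "c": 0}

basic_next = majority_step(path, start)
irreversible_next = majority_step(path, start, irreversible=True)
print(basic_next)          # b is "cured" while a and c become infected
print(irreversible_next)   # everyone is infected
```

In the basic process the single infected vertex is cured at the same moment it infects both neighbors, which is one way to see why the disease interpretation of the basic process is questionable.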
Basic Majority Process
38
39
40
Irreversible Majority Process
41
42
43
Aside: Distributed Computing
Majority processes are studied in distributed
computing.
Goal: Eliminate damage caused by failed
processors (vertices) or at least to restrict their
influence.
Do this by maintaining replicated copies of crucial
data and, when a fault occurs, letting a processor
change “state” if a majority of its neighbors are in a
different state.
Other applications of similar ideas in distributed
computing: distributed database management,
quorum systems, fault local mending.
44
Threshold Processes
Basic k-Threshold Process: You change your state
at time t+1 if at least k of your neighbors have
the opposite state at time t.
Disease interpretation? Same issue as basic
majority processes.
45
Threshold Processes II
Irreversible k-Threshold Process: You change
your state from 0 to 1 at time t+1 if at least k
of your neighbors have state 1 at time t. You
never leave state 1.
Disease interpretation? Infected if sufficiently
many of your neighbors are infected.
Special Case k = 1: Infected if any of your
neighbors is infected.
46
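The threshold rules can be sketched in the same way. The following Python fragment is an illustration under assumed names (`threshold_step`, `run`) and an assumed four-vertex path; it shows the special case k = 1 of the irreversible process, where infection advances one vertex per time step along the path.

```python
def threshold_step(graph, state, k, irreversible=False):
    """One synchronous step of a basic or irreversible k-threshold process."""
    new_state = {}
    for v, nbrs in graph.items():
        opposite = sum(1 for u in nbrs if state[u] != state[v])
        # Change state when at least k neighbors are in the opposite state,
        # except that the irreversible process never leaves state 1.
        flips = opposite >= k and not (irreversible and state[v] == 1)
        new_state[v] = 1 - state[v] if flips else state[v]
    return new_state


def run(graph, state, k, steps, irreversible=False):
    """Return the list of state vectors s(0), s(1), ..., s(steps)."""
    history = [state]
    for _ in range(steps):
        state = threshold_step(graph, state, k, irreversible)
        history.append(state)
    return history


# Hypothetical example: path 0 - 1 - 2 - 3, vertex 0 infected at time 0.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
start = {0: 1, 1: 0, 2: 0, 3: 0}
history = run(path, start, k=1, steps=3, irreversible=True)
infected_counts = [sum(s.values()) for s in history]
print(infected_counts)
```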
Basic 2-Threshold Process
47
48
49
Irreversible 2-Threshold Process
50
51
52
Complications to Add to Model
•k = 1, but you only get infected with a certain
probability.
•You are automatically cured after you are in the
infected state for d time periods.
•You become immune from infection (can’t re-enter
state 1) once you enter and leave state 1.
•A public health authority has the ability to
“vaccinate” a certain number of vertices, making
them immune from infection.
53
Periodicity
State vector: s(t) = (s_1(t), s_2(t), …, s_n(t)).
First example, s(1) = s(3) = s(5) = …,
s(0) = s(2) = s(4) = s(6) = …
Second example: s(1) = s(2) = s(3) = ...
In all of these processes, because there is a finite
set of vertices, for any initial state vector s(0), the
state vector will eventually become periodic, i.e.,
for some P and T, s(t+P) = s(t) for all t > T.
The smallest such P is called the period.
54
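Because the state space is finite and the update rule is deterministic, the period can be found by iterating until some state vector repeats. The sketch below does this for k-threshold processes; the function names and the single-edge example are assumptions made for illustration. It returns the first time at which a recurring state vector appears together with the period P.

```python
def threshold_step(graph, state, k, irreversible=False):
    """One synchronous step of a basic or irreversible k-threshold process."""
    new_state = {}
    for v, nbrs in graph.items():
        opposite = sum(1 for u in nbrs if state[u] != state[v])
        flips = opposite >= k and not (irreversible and state[v] == 1)
        new_state[v] = 1 - state[v] if flips else state[v]
    return new_state


def find_period(graph, state, k, irreversible=False):
    """Return (first time a recurring state appears, period P)."""
    first_seen = {}
    t = 0
    while True:
        key = tuple(state[v] for v in graph)
        if key in first_seen:
            # The state at time t equals the state at time first_seen[key].
            return first_seen[key], t - first_seen[key]
        first_seen[key] = t
        state = threshold_step(graph, state, k, irreversible)
        t += 1


# Hypothetical example: a single edge a - b with a infected.
edge = {"a": ["b"], "b": ["a"]}
start = {"a": 1, "b": 0}
print(find_period(edge, start, k=1))                     # basic process
print(find_period(edge, start, k=1, irreversible=True))  # irreversible process
```

On this example the basic 1-threshold process swaps the two states forever (period 2), while the irreversible process reaches a fixed point (period 1).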
Periodicity II
First example: the period is 2.
Second example: the period is 1.
Both basic and irreversible majority processes and
threshold processes are special cases of symmetric
synchronous neural networks.
Theorem (Goles and Olivos, Poljak and Sura): For
symmetric, synchronous neural networks, the
period is either 1 or 2.
55
Periodicity III
When period is 1, we call the ultimate state vector
a fixed point.
When the fixed point is the vector s(t) = (1,1,…,1)
or (0,0,…,0), we talk about a final common state.
One problem of interest: Given a graph, what
subsets S of the vertices can force one of our
processes to a final common state with entries
equal to the state shared by all the vertices in S in
the initial state?
56
Periodicity IV
Interpretation: Given a graph, what subsets S of
the vertices should we plant a disease with so that
ultimately everyone will get it? (s(t) → (1,1,…,1))
Economic interpretation: What set of people do we
place a new product with to guarantee “saturation”
of the product in the population?
Interpretation: Given a graph, what subsets S of
the vertices should we vaccinate to guarantee that
ultimately everyone will end up without the
disease? (s(t) → (0,0,…,0))
57
Conversion Sets
Conversion set: Subset S of the vertices that can
force one of our processes to a final common state
with entries equal to the state shared by all the
vertices in S in the initial state. (In other words, if
all vertices of S start in same state x = 1 or 0, then
the process goes to a state where all vertices are in
state x.)
Irreversible conversion set if irreversible process.
k-conversion set or irreversible k-conversion set if
a k-threshold process.
58
1-Conversion Sets
k = 1.
What are the conversion sets in a basic 1-threshold
process?
59
1-Conversion Sets
k = 1.
The only conversion set in a basic 1-threshold
process is the set of all vertices. For, if any two
adjacent vertices have 0 and 1 in the initial state,
then they keep switching between 0 and 1 forever.
What are the irreversible 1-conversion sets?
60
Irreversible 1-Conversion Sets
k = 1.
Every single vertex x is an irreversible 1-conversion set if the graph is connected. We make
it 1 and eventually all vertices become 1 by
following paths from x.
61
Conversion Sets for Odd Cycles
C_{2p+1}
Majority process.
What is a conversion set?
62
Conversion Sets for Odd Cycles
C_{2p+1}.
Majority process.
Place p+1 1’s in “alternating” positions.
63
64
65
Conversion Sets for Odd Cycles
We have to be careful where we put the initial 1’s.
p+1 1’s do not suffice if they are next to each
other.
66
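The two placements of p+1 ones can be compared by direct simulation. The sketch below is an illustration with assumed helper names (`cycle`, `converts`); on a cycle every vertex has degree 2, so the majority process is simulated as a 2-threshold process.

```python
def cycle(n):
    """Cycle C_n on vertices 0..n-1."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def threshold_step(graph, state, k, irreversible=False):
    """One synchronous step of a basic or irreversible k-threshold process."""
    new_state = {}
    for v, nbrs in graph.items():
        opposite = sum(1 for u in nbrs if state[u] != state[v])
        flips = opposite >= k and not (irreversible and state[v] == 1)
        new_state[v] = 1 - state[v] if flips else state[v]
    return new_state


def converts(graph, seed, k, irreversible=False):
    """True if starting with 1's exactly on `seed` leads to the all-1 state."""
    state = {v: 1 if v in seed else 0 for v in graph}
    seen = set()
    while True:
        key = tuple(state[v] for v in graph)
        if key in seen:  # the process has entered its final cycle
            return all(x == 1 for x in state.values())
        seen.add(key)
        state = threshold_step(graph, state, k, irreversible)


c5 = cycle(5)                            # p = 2
print(converts(c5, {0, 2, 4}, k=2))      # "alternating" placement
print(converts(c5, {0, 1, 2}, k=2))      # three consecutive vertices
```

The alternating placement necessarily puts two 1's next to each other (vertices 0 and 4 here), and that adjacent pair is what keeps the basic process from oscillating.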
67
68
Conversion Sets for Odd Cycles
k-threshold process.
k = 2: This is the same as a majority process.
k = 3: Nothing ever changes.
69
Irreversible Conversion Sets for Odd
Cycles
What if we want an irreversible conversion set
under the majority process?
Same set of p+1 vertices is an irreversible
conversion set. Moreover, everyone gets infected in
one step.
70
Vaccination Strategies
If you didn’t know whom a bioterrorist might
infect, what people would you vaccinate to be sure
that a disease doesn’t spread very much?
(Vaccinated vertices stay at state 0 regardless of
the state of their neighbors.)
Try odd cycles again. Consider an irreversible 2-threshold process. Suppose your adversary has
enough supply to infect two individuals.
Strategy 1: “Mass vaccination”: make everyone 0
and immune in initial state.
71
Vaccination Strategies
In C5, mass vaccination means vaccinate 5
vertices. This obviously works.
In practice, vaccination is only effective with a
certain probability, so results could be different.
Can we do better than mass vaccination?
What does better mean? If vaccine has no cost and
is unlimited and has no side effects, of course we
use mass vaccination.
72
Vaccination Strategies
What if vaccine is in limited supply? Suppose we
only have enough vaccine to vaccinate 2 vertices.
Consider two different vaccination strategies:
Vaccination Strategy I
Vaccination Strategy II
73
Vaccination Strategy I: Worst Case
(Adversary Infects Two)
Two Strategies for Adversary
Adversary Strategy Ia
Adversary Strategy Ib
74
The “alternation” between your choice of a
defensive strategy and your adversary’s choice
of an offensive strategy suggests we consider
the problem from the point of view of game
theory.
The Food and Drug Administration is studying
the use of game-theoretic models in the
defense against bioterrorism.
75
Vaccination Strategy I
Adversary Strategy Ia
76
Vaccination Strategy I
Adversary Strategy Ib
77
Vaccination Strategy II: Worst Case
(Adversary Infects Two)
Two Strategies for Adversary
Adversary Strategy IIa
Adversary Strategy IIb
78
Vaccination Strategy II
Adversary Strategy IIa
79
Vaccination Strategy II
Adversary Strategy IIb
80
Conclusions about Strategies I
and II
If you can only vaccinate two individuals:
Vaccination Strategy II never leads to more than
two infected individuals, while Vaccination
Strategy I sometimes leads to three infected
individuals (depending upon strategy used by
adversary).
Thus, Vaccination Strategy II is better.
81
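The comparison can be checked by brute force. The figures for Strategies I and II are not reproduced here, so the sketch below relies on an assumption: up to symmetry there are only two ways to vaccinate two vertices of C5, adjacent or non-adjacent. It also assumes the adversary infects only unvaccinated vertices. Function names are illustrative.

```python
from itertools import combinations


def cycle(n):
    """Cycle C_n on vertices 0..n-1."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def final_infected(graph, vaccinated, attack, k=2):
    """Run the irreversible k-threshold process; vaccinated vertices stay 0."""
    infected = set(attack)
    while True:
        newly = {
            v for v in graph
            if v not in infected and v not in vaccinated
            and sum(u in infected for u in graph[v]) >= k
        }
        if not newly:
            return infected
        infected |= newly


def worst_case(graph, vaccinated, attack_size=2):
    """Largest final outbreak over all attacks on unvaccinated vertices."""
    targets = [v for v in graph if v not in vaccinated]
    return max(
        len(final_infected(graph, vaccinated, attack))
        for attack in combinations(targets, attack_size)
    )


c5 = cycle(5)
print(worst_case(c5, {0, 1}))  # vaccinate two adjacent vertices
print(worst_case(c5, {0, 2}))  # vaccinate two non-adjacent vertices
```

Under these assumptions the adjacent choice can end with three infected individuals and the non-adjacent choice never ends with more than two, which matches the descriptions of Strategy I and Strategy II respectively.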
k-Conversion Sets
k-conversion sets are complex.
Consider the graph K4 x K2.
82
k-Conversion Sets II
Exercise: (a). The vertices a, b, c, d, e form a 2-conversion set.
(b). However, the vertices a,b,c,d,e,f do not.
Interpretation: Immunizing one more person can be
worse! (Planting a disease with one more person
can be worse if you want to infect everyone.)
Note: the same does not hold true for irreversible
k-conversion sets.
83
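The exercise can be explored by simulation. The labeled figure is not reproduced here, so the labeling below is an assumption: a, b, c, d form one copy of K4, e, f, g, h form the other, and a-e, b-f, c-g, d-h are the edges joining the copies. With that labeling the sketch reproduces both parts of the exercise.

```python
def k4_times_k2():
    """K4 x K2 with an assumed labeling: layers abcd and efgh, matched in order."""
    labels = "abcdefgh"
    graph = {v: [] for v in labels}
    for layer in (labels[:4], labels[4:]):
        for u in layer:
            graph[u] += [w for w in layer if w != u]
    for u, w in zip(labels[:4], labels[4:]):
        graph[u].append(w)
        graph[w].append(u)
    return graph


def threshold_step(graph, state, k, irreversible=False):
    """One synchronous step of a basic or irreversible k-threshold process."""
    new_state = {}
    for v, nbrs in graph.items():
        opposite = sum(1 for u in nbrs if state[u] != state[v])
        flips = opposite >= k and not (irreversible and state[v] == 1)
        new_state[v] = 1 - state[v] if flips else state[v]
    return new_state


def converts(graph, seed, k, irreversible=False):
    """True if starting with 1's exactly on `seed` leads to the all-1 state."""
    state = {v: 1 if v in seed else 0 for v in graph}
    seen = set()
    while True:
        key = tuple(state[v] for v in graph)
        if key in seen:
            return all(x == 1 for x in state.values())
        seen.add(key)
        state = threshold_step(graph, state, k, irreversible)


g = k4_times_k2()
print(converts(g, set("abcde"), k=2))    # five vertices
print(converts(g, set("abcdef"), k=2))   # six vertices
```

With six seeds, e and f each see two neighbors in the opposite state and flip to 0 while g and h flip to 1, so the second layer oscillates with period 2. In the irreversible process e and f cannot flip back, which is why the anomaly disappears there.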
NP-Completeness
Problem: Given a positive integer d and a graph
G, does G have a k-conversion set of size at
most d?
Theorem (Dreyer 2000): This problem is NP-complete for fixed k > 2.
(Whether or not it is NP-complete for k = 2
remains open.)
Same conclusions for irreversible k-conversion set.
84
k-Conversion Sets in Regular Graphs
G is r-regular if every vertex has degree r.
Theorem (Dreyer 2000): Let G = (V,E) be a
connected r-regular graph and D be a set of
vertices.
(a). D is an irreversible r-conversion set iff V-D
is an independent set.
(b). D is an r-conversion set iff V-D is an
independent set and D is not an independent set.
85
k-Conversion Sets in Regular Graphs
II
Corollary (Dreyer 2000):
(a). The size of the smallest irreversible 2-conversion set in Cn is ceiling[n/2].
(b). The size of the smallest 2-conversion set in Cn
is ceiling[(n+1)/2].
ceiling[x] = smallest integer at least as big as x.
This result agrees with our observation.
86
k-Conversion Sets in Regular Graphs
III
Proof:
(a). Cn is 2-regular. The largest independent set
has size floor[n/2], where floor[x] = largest integer
no bigger than x. Thus, the smallest D so that
V-D is independent has size ceiling[n/2].
(b). If n is odd, taking the first, third, …, nth
vertices around the cycle gives a set that is not
independent and whose complement is
independent. If n is even, every vertex set of size
n/2 with an independent complement is itself
independent, so an additional vertex is needed.
87
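For small cycles the corollary can be confirmed by exhaustive search. The sketch below tries every seed set in order of size and simulates the 2-threshold process; helper names are assumptions, and the search is only practical for small n.

```python
from itertools import combinations


def cycle(n):
    """Cycle C_n on vertices 0..n-1."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def threshold_step(graph, state, k, irreversible=False):
    """One synchronous step of a basic or irreversible k-threshold process."""
    new_state = {}
    for v, nbrs in graph.items():
        opposite = sum(1 for u in nbrs if state[u] != state[v])
        flips = opposite >= k and not (irreversible and state[v] == 1)
        new_state[v] = 1 - state[v] if flips else state[v]
    return new_state


def converts(graph, seed, k, irreversible=False):
    """True if starting with 1's exactly on `seed` leads to the all-1 state."""
    state = {v: 1 if v in seed else 0 for v in graph}
    seen = set()
    while True:
        key = tuple(state[v] for v in graph)
        if key in seen:
            return all(x == 1 for x in state.values())
        seen.add(key)
        state = threshold_step(graph, state, k, irreversible)


def smallest_conversion_set(n, irreversible):
    """Size of a smallest (irreversible) 2-conversion set in C_n, by search."""
    graph = cycle(n)
    for size in range(n + 1):
        for seed in combinations(range(n), size):
            if converts(graph, set(seed), 2, irreversible):
                return size


for n in range(3, 9):
    print(n, smallest_conversion_set(n, True), smallest_conversion_set(n, False))
```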
k-Conversion Sets in Trees
The simplest case is when every internal vertex of
the tree has degree > k.
Leaf = vertex of degree 1; internal vertex = not a
leaf.
Can you guess a 2-conversion set here?
88
k-Conversion Sets in Trees
Can you guess a 2-conversion set here?
All the leaves have to be in it: a leaf has only one neighbor, so it can never
have 2 neighbors in the opposite state. This will suffice.
89
90
91
k-Conversion Sets in Trees
Theorem (Dreyer 2000): Let T be a tree and every
internal vertex have degree > k, where k > 1. Then
the smallest k-conversion set and the smallest
irreversible k-conversion set have size equal to the
number of leaves of the tree.
92
k-Conversion Sets in Grids
Let G(m,n) be the rectangular grid graph with m
rows and n columns.
G(3,4)
93
Toroidal Grids
The toroidal grid T(m,n) is obtained from the
rectangular grid G(m,n) by adding edges from the
first vertex in each row to the last and from the first
vertex in each column to the last.
Toroidal grids are easier to deal with than
rectangular grids because they form regular graphs:
Every vertex has degree 4. Thus, we can make use
of the results about regular graphs.
94
T(3,4)
95
4-Conversion Sets in Toroidal Grids
Theorem (Dreyer 2000): In a toroidal grid T(m,n)
(a). The size of the smallest 4-conversion set is
max{n(ceiling[m/2]), m(ceiling[n/2])} if m or n is odd,
mn/2 + 1 if m and n are even.
(b). The size of the smallest irreversible 4-conversion set is as above when m or n is
odd, and it is mn/2 when m and n are even.
96
Part of the Proof: Recall that D is an irreversible
4-conversion set in a 4-regular graph iff V-D is
independent.
V-D independent means that every edge {u,v} in
G has u or v in D. In particular, the ith row
must contain at least ceiling[n/2] vertices in D and
the ith column at least ceiling[m/2] vertices in D
(alternating starting with the end vertex of the row
or column).
We must cover all rows and all columns, and so
need at least max{n(ceiling[m/2]), m(ceiling[n/2])}
vertices in an irreversible 4-conversion set.
97
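The formulas can be compared with a brute-force search on small toroidal grids. Rather than simulating the process, the sketch below uses the regular-graph criterion recalled above (V-D independent, plus D not independent for the basic process). It assumes m, n ≥ 3 so that T(m,n) is a simple 4-regular graph; the function names are illustrative.

```python
from itertools import combinations


def torus(m, n):
    """Vertices and edges of the toroidal grid T(m,n), assuming m, n >= 3."""
    vertices = [(i, j) for i in range(m) for j in range(n)]
    edges = set()
    for i, j in vertices:
        edges.add(frozenset({(i, j), ((i + 1) % m, j)}))
        edges.add(frozenset({(i, j), (i, (j + 1) % n)}))
    return vertices, [tuple(e) for e in edges]


def formula(m, n, irreversible):
    """Sizes stated in the theorem for T(m,n)."""
    if m % 2 or n % 2:
        return max(n * ((m + 1) // 2), m * ((n + 1) // 2))
    return m * n // 2 + (0 if irreversible else 1)


def smallest_by_search(m, n, irreversible):
    """Smallest D meeting the regular-graph criterion, by exhaustive search."""
    vertices, edges = torus(m, n)
    for size in range(len(vertices) + 1):
        for candidate in combinations(vertices, size):
            d = set(candidate)
            complement_independent = all(u in d or v in d for u, v in edges)
            d_not_independent = any(u in d and v in d for u, v in edges)
            if complement_independent and (irreversible or d_not_independent):
                return size


for m, n in [(3, 3), (3, 4), (4, 4)]:
    for irreversible in (True, False):
        print(m, n, irreversible,
              smallest_by_search(m, n, irreversible), formula(m, n, irreversible))
```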
4-Conversion Sets for Rectangular
Grids
More complicated methods give:
Theorem (Dreyer 2000): The size of the smallest 4-conversion set and smallest irreversible 4-conversion set in a grid graph G(m,n) is
2m + 2n - 4 + floor[(m-2)(n-2)/2]
98
4-Conversion Sets for Rectangular
Grids
Consider G(3,3):
2m + 2n - 4 + floor[(m-2)(n-2)/2] = 8.
What is a smallest 4-conversion set and why 8?
All boundary vertices have degree < 4 and so must
be included in any 4-conversion set. They give
a conversion set.
100
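The G(3,3) argument can be checked with a few lines of Python. The sketch below builds the grid, collects the vertices of degree less than 4, and simulates the irreversible 4-threshold process from that seed. The helper names are assumptions, and G(4,4) is included only as an extra illustration of why the floor term appears in the formula.

```python
def grid(m, n):
    """Rectangular grid graph G(m,n) with m rows and n columns."""
    graph = {}
    for i in range(m):
        for j in range(n):
            candidates = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
            graph[(i, j)] = [(a, b) for a, b in candidates
                             if 0 <= a < m and 0 <= b < n]
    return graph


def formula(m, n):
    """Size stated in the theorem for G(m,n)."""
    return 2 * m + 2 * n - 4 + ((m - 2) * (n - 2)) // 2


def boundary(graph):
    """Vertices of degree < 4, which can never reach the threshold 4."""
    return {v for v, nbrs in graph.items() if len(nbrs) < 4}


def spreads_to_all(graph, seed, k=4):
    """Irreversible k-threshold process: does the seed infect every vertex?"""
    infected = set(seed)
    while True:
        newly = {v for v in graph if v not in infected
                 and sum(u in infected for u in graph[v]) >= k}
        if not newly:
            return len(infected) == len(graph)
        infected |= newly


g33 = grid(3, 3)
print(len(boundary(g33)), formula(3, 3), spreads_to_all(g33, boundary(g33)))

g44 = grid(4, 4)
print(len(boundary(g44)), formula(4, 4), spreads_to_all(g44, boundary(g44)))
```

In G(3,3) the eight boundary vertices surround the single interior vertex, so it is converted in one step. In G(4,4) the twelve boundary vertices alone are not enough, and two more interior vertices are needed to reach the 14 given by the formula.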
More Realistic Models
Many oversimplifications. For instance:
•What if you stay infected only a certain number of
days?
•What if you are not necessarily infective for the
first few days you are sick?
•What if your threshold k for changes from 0 to 1
changes depending upon how long you have been
uninfected?
101
Alternative Models to Explore
Consider an irreversible process in which you stay
in the infected state (state 1) for d time periods
after entering it and then go back to the uninfected
state (state 0).
Consider a k-threshold process in which we
vaccinate a person in state 0 once k-1 neighbors are
infected (in state 1).
Etc. -- let your imagination roam free ...
102
More Realistic Models
Our models are deterministic. How do probabilities
enter?
•What if you only get infected with a certain
probability if you meet an infected person?
•What if vaccines only work with a certain
probability?
•What if the amount of time you remain infective
exhibits a probability distribution?
103
Alternative Model to Explore
Consider an irreversible 1-threshold process in
which you stay infected for d time periods and
then enter the uninfected state.
Assume that you get infected with probability p if
at least one of your neighbors is infected.
What is the probability that an epidemic will end
with no one infected?
104
The Case d = 2, p = 1/2
Consider the following initial state:
105
The Case d = 2, p = 1/2
With probability 1/2, vertex a does not get infected
at time 1.
Similarly for vertex b.
Thus, with probability 1/4, we stay in the same
states at time 1.
106
The Case d = 2, p = 1/2
Suppose vertices are still in same states at time 1 as
they were at time 0. With probability 1/2, vertex a
does not get infected at time 2.
Similarly for vertex b.
Also after time 1, vertices c and d have been
infected for two time periods and thus enter the
uninfected state.
Thus, with probability 1/4, we get to the following
state at time 2:
107
108
The Case d = 2, p = 1/2
Thus, with probability 1/4 x 1/4 = 1/16, we enter
this state with no one infected at time 2.
However, we might enter this state at a later time.
It is not hard to show (using the theory of finite
Markov chains) that we will end in state (0,0,0,0).
(This is the only absorbing state in an absorbing
Markov chain.) Thus, with probability 1 we will
eventually kill the disease off entirely.
109
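The 1/16 computed above can be reproduced by propagating an exact probability distribution over states. The figure with the initial state is not reproduced here, so the graph below is an assumption: a 4-cycle a - c - b - d - a in which c and d start in their first infected period and a and b start uninfected. A state records, for each vertex, 0 if uninfected or the number of periods it has been infected.

```python
from fractions import Fraction
from itertools import product


def step_distribution(graph, dist, p, d):
    """Advance a distribution over states by one time step.

    A state is a tuple of ages in sorted vertex order:
    0 = uninfected, a >= 1 = currently in its a-th infected period.
    """
    order = sorted(graph)
    index = {v: i for i, v in enumerate(order)}
    new_dist = {}
    for state, prob in dist.items():
        # Uninfected vertices with at least one infected neighbor are at risk.
        at_risk = [v for v in order if state[index[v]] == 0
                   and any(state[index[u]] > 0 for u in graph[v])]
        # Infected vertices age by one period and recover after d periods.
        base = [0 if age == 0 or age >= d else age + 1 for age in state]
        for outcome in product([0, 1], repeat=len(at_risk)):
            nxt = list(base)
            weight = prob
            for v, hit in zip(at_risk, outcome):
                if hit:
                    nxt[index[v]] = 1
                weight *= p if hit else 1 - p
            key = tuple(nxt)
            new_dist[key] = new_dist.get(key, 0) + weight
    return new_dist


# Assumed graph and initial state (order a, b, c, d): c and d infected.
graph = {"a": ["c", "d"], "b": ["c", "d"], "c": ["a", "b"], "d": ["a", "b"]}
dist = {(0, 0, 1, 1): Fraction(1)}
for _ in range(2):
    dist = step_distribution(graph, dist, p=Fraction(1, 2), d=2)

print(dist[(0, 0, 0, 0)])  # probability that no one is infected at time 2
```

Running more steps and watching the mass on (0, 0, 0, 0) grow is one way to get a feel for the absorbing-chain argument, although the sketch itself proves nothing about the limit.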
The Case d = 2, p = 1/2
Is this realistic? What might we do to modify the
model to make it more realistic?
110
How do we Analyze this or More
Complex Models for Graphs?
Computer simulation is an important tool.
Example: At the Johns Hopkins University and the
Brookings Institution, Donald Burke and Joshua
Epstein have developed a simple model for a
region with two towns totalling 800 people. It
involves a few more probabilistic assumptions than
ours. They use single simulations as a learning
device. They also run large numbers of simulations
and look at averages of outcomes.
111
How do we Analyze this or More
Complex Models for Graphs?
Burke and Epstein are using the model to do “what
if” experiments:
What if we adopt a particular vaccination strategy?
What happens if we try different plans for
quarantining infectious individuals?
There is much more analysis of a similar nature
that can be done with graph-theoretical models.
112
Would Graph Theory help with a
deliberate outbreak of Anthrax?
113
What about a deliberate release of
smallpox?
114
Similar approaches, using mathematical models
based in DM/TCS, have proven useful in many
other fields, to:
•make policy
•plan operations
•analyze risk
•compare interventions
•identify the cause of observed events
115
Why shouldn’t these approaches work in the
defense against bioterrorism?
116