Anomaly Detection and Virus Propagation in Large Graphs Christos Faloutsos CMU

Anomaly Detection and Virus
Propagation in Large Graphs
Christos Faloutsos
CMU
Thank you!
• Dr. Ching-Hao (Eric) Mao
• Prof. Kenneth Pao
Taiwan'12
Faloutsos, Prakash, Chau, Koutra, Akoglu
2
Outline
• Part 1: anomaly detection
– OddBall (anomaly detection)
– Belief Propagation
– Conclusions
• Part 2: influence propagation
OddBall: Spotting Anomalies
in Weighted Graphs
Leman Akoglu, Mary McGlohon, Christos
Faloutsos
Carnegie Mellon University
School of Computer Science
PAKDD 2010, Hyderabad, India
Main idea
For each node,
• Extract the ‘egonet’ (= 1-step-away neighbors)
• Extract features (# edges, total weight, etc.)
• Compare with the rest of the population
What is an egonet?
[Figure: an ‘ego’ node and its ‘egonet’]
Selected Features
• N_i: number of neighbors (degree) of ego i
• E_i: number of edges in egonet i
• W_i: total weight of egonet i
• λ_w,i: principal eigenvalue of the weighted adjacency matrix of egonet i
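The four features above can be computed directly from a weighted adjacency matrix. The following is a minimal sketch, not the OddBall implementation: the function name `egonet_features`, the dense NumPy representation, and the toy star graph are all assumptions made for illustration, and the graph is assumed to be undirected with no self-loops.

```python
import numpy as np

def egonet_features(W, i):
    """Return (N_i, E_i, W_i, lambda_w_i) for the egonet of node i.

    W is a symmetric weighted adjacency matrix (0 means no edge).
    """
    W = np.asarray(W, dtype=float)
    neighbors = np.flatnonzero(W[i])            # 1-step-away neighbors of the ego
    members = np.concatenate(([i], neighbors))  # ego plus neighbors
    sub = W[np.ix_(members, members)]           # induced subgraph = egonet
    upper = np.triu(sub, k=1)                   # count each undirected edge once
    n_i = len(neighbors)                        # N_i: degree of the ego
    e_i = int(np.count_nonzero(upper))          # E_i: number of edges in the egonet
    w_i = float(upper.sum())                    # W_i: total weight of the egonet
    lam = float(np.linalg.eigvalsh(sub)[-1])    # principal eigenvalue (largest)
    return n_i, e_i, w_i, lam

# Toy graph (hypothetical): node 0 is the hub of a star with 4 leaves, unit weights.
star = np.zeros((5, 5))
star[0, 1:] = star[1:, 0] = 1.0
star_features = egonet_features(star, 0)
```

Comparing these feature tuples across all nodes (for example, E_i against N_i) is what lets the near-cliques and near-stars stand out from the rest of the population.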
Near-Clique/Star
[Figures omitted; labeled example node: Andrew Lewis (director)]
Outline
• Part 1: anomaly detection
– OddBall (anomaly detection)
– Belief Propagation
– Conclusions
• Part 2: influence propagation
E-bay Fraud detection
w/ Polo Chau &
Shashank Pandit, CMU
[WWW’07]
E-bay Fraud detection - NetProbe
Popular press
And less desirable attention:
• E-mail from ‘Belgium police’ (‘copy of your
code?’)
Outline
• OddBall (anomaly detection)
• Belief Propagation
– Ebay fraud
– Symantec malware detection
– Unification results
• Conclusions
Polonium: Tera-Scale Graph Mining
and Inference for Malware Detection
SDM 2011, Mesa, Arizona
Polo Chau, Machine Learning Dept
Carey Nachenberg, Vice President & Fellow
Jeffrey Wilhelm, Principal Software Engineer
Adam Wright, Software Engineer
Prof. Christos Faloutsos, Computer Science Dept
Polonium: The Data
60+ terabytes of data anonymously
contributed by participants of worldwide
Norton Community Watch program
50+ million machines
900+ million executable files
Constructed a machine-file bipartite graph
(0.2 TB+)
1 billion nodes (machines and files)
37 billion edges
Polonium: Key Ideas
• Use Belief Propagation to propagate domain
knowledge in machine-file graph to detect
malware
• Use “guilt-by-association” (i.e., homophily)
– E.g., files that appear on machines with many bad
files are more likely to be bad
• Scalability: handles 37 billion-edge graph
Polonium: One-Interaction Results
[Plot: True Positive Rate vs. False Positive Rate, with the ‘Ideal’ point marked]
84.9% True Positive Rate at 1% False Positive Rate
• True Positive Rate: % of malware correctly identified
• False Positive Rate: % of non-malware wrongly labeled as malware
Outline
• Part 1: anomaly detection
– OddBall (anomaly detection)
– Belief Propagation
• Ebay fraud
• Symantec malware detection
• Unification results
– Conclusions
• Part 2: influence propagation
Unifying Guilt-by-Association
Approaches:
Theorems and Fast Algorithms
Danai Koutra
U Kang
Hsing-Kuo Kenneth Pao
Tai-You Ke
Duen Horng (Polo) Chau
Christos Faloutsos
ECML PKDD, 5-9 September 2011, Athens, Greece
Problem Definition:
GBA techniques
Given: a graph and a few labeled nodes
Find: labels of the rest (assuming network effects)
Homophily and Heterophily
Step 1: All methods handle homophily
BUT
Step 2: NOT all methods handle heterophily; the proposed method does!
Are they related?
• RWR (Random Walk with Restarts)
– Google’s PageRank (‘if my friends are important,
I’m important, too’)
• SSL (Semi-supervised learning)
– minimize the differences among neighbors
• BP (Belief propagation)
– send messages to neighbors, on what you believe
about them
Are they related? YES!
• RWR (Random Walk with Restarts)
– Google’s PageRank (‘if my friends are important,
I’m important, too’)
• SSL (Semi-supervised learning)
– minimize the differences among neighbors
• BP (Belief propagation)
– send messages to neighbors, on what you believe
about them
Correspondence of Methods
Method | Matrix            | Unknown | Known
RWR    | [I – c A D^-1]    | × x     | = (1-c) y
SSL    | [I + a(D - A)]    | × x     | = y
FaBP   | [I + a D - c’A]   | × b_h   | = φ_h
[Figure: 3-node example with I = diag(1, 1, 1), D = diag(d1, d2, d3), adjacency matrix A = [0 1 0; 1 0 1; 0 1 0]; the unknown vector ‘?’ holds the final labels/beliefs, the known vector (0, 1, 1) holds the prior labels/beliefs]
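All three methods in the table reduce to solving one linear system, which the sketch below makes concrete on the 3-node chain from the slide. It is an illustration only: the function names, the parameter values (`c`, `a`, `c_prime`) and the prior vectors are hypothetical choices, and a dense solver stands in for whatever scalable solver a real system would use.

```python
import numpy as np

# 3-node chain from the slide: node 2 is linked to nodes 1 and 3.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
D = np.diag(A.sum(axis=1))   # degree matrix diag(d1, d2, d3)
I = np.eye(3)

def solve_rwr(A, D, y, c):
    """RWR: solve [I - c A D^-1] x = (1 - c) y (assumes no isolated nodes)."""
    return np.linalg.solve(I - c * A @ np.linalg.inv(D), (1 - c) * y)

def solve_ssl(A, D, y, a):
    """SSL: solve [I + a (D - A)] x = y."""
    return np.linalg.solve(I + a * (D - A), y)

def solve_fabp(A, D, phi, a, c_prime):
    """FaBP: solve [I + a D - c' A] b_h = phi_h."""
    return np.linalg.solve(I + a * D - c_prime * A, phi)

# Hypothetical priors: only node 1 carries a label.
y = np.array([1.0, 0.0, 0.0])
phi = np.array([0.1, 0.0, 0.0])

x_rwr = solve_rwr(A, D, y, c=0.85)
x_ssl = solve_ssl(A, D, y, a=0.5)
b_fabp = solve_fabp(A, D, phi, a=0.1, c_prime=0.2)
```

With these illustrative parameters the FaBP beliefs decay with distance from the labeled node, which is the homophily behavior described earlier.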
Results: Scalability
[Plot: runtime (min) vs. # of edges (Kronecker graphs)]
FaBP is linear in the number of edges.
Results (5): Parallelism
[Plot: % accuracy vs. runtime (min)]
FaBP ~2x faster
& wins/ties on accuracy.
Conclusions
• Anomaly detection: hand-in-hand with
pattern discovery (‘anomalies’ == ‘rare
patterns’)
• ‘OddBall’ for large graphs
• ‘NetProbe’ and belief propagation: exploit
network effects.
• FaBP: fast & accurate
Outline
• Part 1: anomaly detection
– OddBall (anomaly detection)
– Belief Propagation
– Conclusions
• Part 2: influence propagation
Influence propagation in
large graphs - theorems and
algorithms
B. Aditya Prakash
http://www.cs.cmu.edu/~badityap
Christos Faloutsos
http://www.cs.cmu.edu/~christos
Carnegie Mellon University
Networks are everywhere!
Facebook Network [2010]
Gene Regulatory Network
[Decourty 2008]
Human Disease Network
[Barabasi 2007]
The Internet [2005]
Dynamical Processes over networks
are also everywhere!
Why do we care?
• Information Diffusion
• Viral Marketing
• Epidemiology and Public Health
• Cyber Security
• Human mobility
• Games and Virtual Worlds
• Ecology
• Social Collaboration
........
Why do we care? (1: Epidemiology)
• Dynamical Processes over networks
Diseases over contact networks
[Figure: CDC data: visualization of the first 35 tuberculosis (TB) patients and their 1039 contacts] [AJPH 2007]
Why do we care? (1: Epidemiology)
• Dynamical Processes over networks
[Figure: US-MEDICARE NETWORK 2005]
• Each circle is a hospital
• ~3000 hospitals
• More than 30,000 patients transferred
Problem: Given k units of disinfectant, whom to immunize?
Why do we care? (1: Epidemiology)
[Figure: US-MEDICARE NETWORK 2005; CURRENT PRACTICE vs. OUR METHOD: ~6x fewer!]
Hospital-acquired infections took 99K+ lives and cost $5B+ (all per year)
Why do we care? (2: Online
Diffusion)
> 800m users, ~$1B
revenue [WSJ 2010]
~100m active users
> 50m users
Why do we care? (2: Online
Diffusion)
• Dynamical Processes over networks
Buy Versace™!
Followers
Celebrity
Social Media Marketing
High Impact – Multiple Settings
epidemic out-breaks
Q. How to squash rumors faster?
products/viruses
Q. How do opinions spread?
transmit s/w patches
Q. How to market better?
Research Theme
DATA: Large real-world networks & processes
ANALYSIS: Understanding
POLICY/ACTION: Managing
In this talk
Given propagation models:
Q1: Will an epidemic happen?
(ANALYSIS: Understanding)
In this talk
Q2: How to immunize and control outbreaks better?
(POLICY/ACTION: Managing)
Outline
• Part 1: anomaly detection
• Part 2: influence propagation
• Motivation
• Epidemics: what happens? (Theory)
• Action: Who to immunize? (Algorithms)
A fundamental question
Strong
Virus
Epidemic?
example (static graph)
Weak Virus
Epidemic?
Problem Statement
[Plot: # infected vs. time; curves above the threshold (epidemic) and below it (extinction)]
Separate the regimes?
Find a condition under which
– the virus will die out exponentially quickly
– regardless of the initial infection condition
Threshold (static version)
Problem Statement
• Given:
–Graph G, and
–Virus specs (attack prob. etc.)
• Find:
–A condition for virus extinction/invasion
Threshold: Why important?
• Accelerating simulations
• Forecasting (‘What-if’ scenarios)
• Design of contagion and/or topology
• A great handle to manipulate the spreading
– Immunization
– Maximize collaboration
…..
Outline
• Motivation
• Epidemics: what happens? (Theory)
– Background
– Result (Static Graphs)
– Proof Ideas (Static Graphs)
– Bonus 1: Dynamic Graphs
– Bonus 2: Competing Viruses
• Action: Who to immunize? (Algorithms)
“SIR” model: life immunity
(mumps)
• Each node in the graph is in one of three states
– Susceptible (i.e. healthy)
– Infected
– Removed (i.e. can’t get infected again)
[Figure: example snapshots at t=1, t=2, t=3; transition labeled ‘Prob. δ’]
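The three-state model above is easy to simulate in discrete time. The following is a rough sketch under stated assumptions, not the simulator used for the talk's experiments: each infected node infects each susceptible neighbor with probability `beta` per tick and is removed with probability `delta` per tick; the names `simulate_sir`, `beta`, `delta`, and the toy path graph are hypothetical.

```python
import random

def simulate_sir(adj, seeds, beta, delta, steps, rng=None):
    """Discrete-time SIR on a graph.

    adj: dict mapping each node to a list of its neighbors.
    Returns a list of (susceptible, infected, removed) counts,
    one entry for the initial state plus one per tick.
    """
    rng = rng or random.Random(0)
    state = {v: "S" for v in adj}
    for v in seeds:
        state[v] = "I"

    def counts(st):
        values = list(st.values())
        return values.count("S"), values.count("I"), values.count("R")

    history = [counts(state)]
    for _ in range(steps):
        new_state = dict(state)              # synchronous update
        for v, s in state.items():
            if s != "I":
                continue
            for u in adj[v]:                 # try to infect susceptible neighbors
                if state[u] == "S" and rng.random() < beta:
                    new_state[u] = "I"
            if rng.random() < delta:         # infected node becomes removed
                new_state[v] = "R"
        state = new_state
        history.append(counts(state))
    return history

# Toy path graph 0 - 1 - 2 (hypothetical), seeded at node 0.
path = {0: [1], 1: [0, 2], 2: [1]}
certain_spread = simulate_sir(path, seeds=[0], beta=1.0, delta=1.0, steps=3)
no_spread = simulate_sir(path, seeds=[0], beta=0.0, delta=1.0, steps=2)
```

Because removed nodes never return to the susceptible state, the removed count can only grow, which is the "life immunity" property in the slide title.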
Terminology: continued
• Other virus propagation models (“VPM”)
– SIS : susceptible-infected-susceptible, flu-like
– SIRS : temporary immunity, like pertussis
– SEIR : mumps-like, with virus incubation
(E = Exposed)
….………….
• Underlying contact-network: ‘who-can-infect-whom’
Related Work
• R. M. Anderson and R. M. May. Infectious Diseases of Humans. Oxford University Press, 1991.
• A. Barrat, M. Barthélemy, and A. Vespignani. Dynamical Processes on Complex Networks. Cambridge University Press, 2010.
• F. M. Bass. A new product growth for model consumer durables. Management Science, 15(5):215–227, 1969.
• D. Chakrabarti, Y. Wang, C. Wang, J. Leskovec, and C. Faloutsos. Epidemic thresholds in real networks. ACM TISSEC, 10(4), 2008.
• D. Easley and J. Kleinberg. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press, 2010.
• A. Ganesh, L. Massoulie, and D. Towsley. The effect of network topology in spread of epidemics. IEEE INFOCOM, 2005.
• Y. Hayashi, M. Minoura, and J. Matsukubo. Recoverable prevalence in growing scale-free networks and the effective immunization. arXiv:cond-mat/0305549 v2, Aug. 6 2003.
• H. W. Hethcote. The mathematics of infectious diseases. SIAM Review, 42, 2000.
• H. W. Hethcote and J. A. Yorke. Gonorrhea transmission dynamics and control. Springer Lecture Notes in Biomathematics, 46, 1984.
• J. O. Kephart and S. R. White. Directed-graph epidemiological models of computer viruses. IEEE Computer Society Symposium on Research in Security and Privacy, 1991.
• J. O. Kephart and S. R. White. Measuring and modeling computer virus prevalence. IEEE Computer Society Symposium on Research in Security and Privacy, 1993.
• R. Pastor-Satorras and A. Vespignani. Epidemic spreading in scale-free networks. Physical Review Letters 86, 14, 2001.
• ………
All are about either:
• Structured topologies (cliques, block-diagonals, hierarchies, random)
• Specific virus propagation models
• Static graphs
Outline
• Motivation
• Epidemics: what happens? (Theory)
– Background
– Result (Static Graphs)
– Proof Ideas (Static Graphs)
– Bonus 1: Dynamic Graphs
– Bonus 2: Competing Viruses
• Action: Who to immunize? (Algorithms)
What should the answer look like?
• Answer should depend on:
– Graph
– Virus Propagation Model (VPM)
• But how??
– Graph – average degree? max. degree? diameter?
– VPM – which parameters?
– How to combine – linear? quadratic? exponential?
– e.g., some expression in d_avg and d_max? in d_avg and the diameter? …..
Static Graphs: Our Main Result
• Informally, for
– any arbitrary topology (adjacency matrix A)
– any virus propagation model (VPM) in standard literature
the epidemic threshold depends only
1. on λ, the first eigenvalue of A, and
2. on some constant C_VPM, determined by the virus propagation model
No epidemic if λ * C_VPM < 1
w/ Deepayan Chakrabarti
In Prakash+ ICDM 2011 (Selected among best papers).
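The condition λ * C_VPM < 1 can be checked numerically in a few lines. The sketch below assumes an undirected graph given as a dense symmetric matrix and uses the SIS constant C_VPM = β/δ (the strength formula λ β / δ quoted later in the talk); the helper names and the toy clique are hypothetical.

```python
import numpy as np

def first_eigenvalue(A):
    """Largest eigenvalue of a symmetric adjacency matrix."""
    return float(np.linalg.eigvalsh(A)[-1])

def effective_strength(A, c_vpm):
    """s = lambda * C_VPM; s < 1 means no epidemic."""
    return first_eigenvalue(A) * c_vpm

def no_epidemic(A, c_vpm):
    return effective_strength(A, c_vpm) < 1.0

# Toy graph (hypothetical): clique on 5 nodes, so lambda = 4.
clique = np.ones((5, 5)) - np.eye(5)

# SIS-like virus: C_VPM = beta / delta.
weak = effective_strength(clique, c_vpm=0.1 / 0.5)    # about 0.8, below threshold
strong = effective_strength(clique, c_vpm=0.2 / 0.5)  # about 1.6, above threshold
```

For large sparse graphs a sparse eigensolver would replace the dense `eigvalsh` call; the threshold test itself stays the same.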
Our thresholds for some models
• s = effective strength
• s < 1 : below threshold
• s = 1 : threshold (tipping point)
Models                 | Effective Strength (s)
SIS, SIR, SIRS, SEIR   | s = λ · β / δ
SIV, SEIV              | s = λ · βγ / (δ(γ + θ))
SI₁I₂V₁V₂ (H.I.V.)     | s = λ · (β₁v₂ + β₂ε) / (v₂(ε + v₁))
Our result: Intuition for λ
“Official” definition:
• Let A be the adjacency matrix. Then λ is the root with the largest magnitude of the characteristic polynomial of A [det(A – xI)].
• Doesn’t give much intuition!
“Un-official” intuition:
• λ ~ # paths in the graph
• A^k ≈ λ^k · u u^T (u: the eigenvector of λ)
• A^k (i, j) = # of paths i → j of length k
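The "λ ~ # paths" intuition can be checked numerically: the total number of length-k paths grows by roughly a factor of λ each time k increases by one. The sketch below uses a small hypothetical graph; note that the entries of A^k count paths in which nodes may repeat.

```python
import numpy as np

# Toy graph (hypothetical): triangle 0-1-2 plus a pendant node 3 attached to node 0.
A = np.array([[0, 1, 1, 1],
              [1, 0, 1, 0],
              [1, 1, 0, 0],
              [1, 0, 0, 0]], dtype=float)

lam = float(np.linalg.eigvalsh(A)[-1])   # first eigenvalue of A

def total_paths(A, k):
    """Sum of all entries of A^k = number of length-k paths (nodes may repeat)."""
    return float(np.linalg.matrix_power(A, k).sum())

k = 40
growth = total_paths(A, k + 1) / total_paths(A, k)   # approaches lambda as k grows

# A^2 (0, 0) counts length-2 paths from node 0 back to itself: one per neighbor.
round_trips = np.linalg.matrix_power(A, 2)[0, 0]
```

The convergence of `growth` to `lam` relies on the graph being connected and not bipartite, which holds for this toy example.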
Largest Eigenvalue (λ)
better connectivity → higher λ
[Figure: three example graphs with N nodes, in order of increasing connectivity]
• General N: λ ≈ 2, λ = √N, λ = N-1
• N = 1000: λ ≈ 2, λ = 31.67, λ = 999
Examples: Simulations – SIR (mumps)
[Plots: (a) Infection profile: fraction of infections vs. time ticks; (b) “Take-off” plot: footprint vs. effective strength]
PORTLAND graph: synthetic population, 31 million links, 6 million nodes
Examples: Simulations – SIRS (pertussis)
[Plots: (a) Infection profile: fraction of infections vs. time ticks; (b) “Take-off” plot: footprint vs. effective strength]
PORTLAND graph: synthetic population, 31 million links, 6 million nodes
[Diagram: λ * C_VPM < 1, with λ labeled ‘Graph-based’ and C_VPM labeled ‘Model-based’; proof ingredients: general VPM structure; topology and stability]
See paper for full proof
Outline
• Motivation
• Epidemics: what happens? (Theory)
– Background
– Result (Static Graphs)
– Proof Ideas (Static Graphs)
– Bonus 1: Dynamic Graphs
– Bonus 2: Competing Viruses
• Action: Who to immunize? (Algorithms)
Dynamic Graphs: Epidemic?
Alternating behaviors
• DAY (e.g., work): one adjacency matrix
• NIGHT (e.g., home): another adjacency matrix
Model Description
• SIS model
– recovery rate δ
– infection rate β
[Figure: nodes N1, N2, N3 and X; transitions Healthy → Infected (prob. β) and Infected → Healthy (prob. δ)]
• Set of T arbitrary graphs (each N × N): day, night, weekend, …..
Our result: Dynamic Graphs Threshold
• Informally, NO epidemic if eig(S) < 1
– eig(S): largest eigenvalue of the system matrix S (a single number!)
In Prakash+, ECML-PKDD 2010
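The test eig(S) < 1 can be sketched as follows. The definition of S is not recoverable from the slide text, so this example assumes the product form S = ∏ ((1 − δ)I + β A_i) over the T alternating graphs; that form, the function names, and the toy day/night graphs are assumptions for illustration and should be checked against the cited paper.

```python
import numpy as np

def system_matrix(graphs, beta, delta):
    """Assumed form: S = product over graphs of ((1 - delta) I + beta A_i)."""
    n = graphs[0].shape[0]
    S = np.eye(n)
    for A in graphs:
        S = ((1.0 - delta) * np.eye(n) + beta * A) @ S
    return S

def no_epidemic(graphs, beta, delta):
    """True if the largest eigenvalue (in magnitude) of S is below 1."""
    S = system_matrix(graphs, beta, delta)
    return float(np.max(np.abs(np.linalg.eigvals(S)))) < 1.0

# Toy alternating graphs on 4 nodes (hypothetical).
day = np.zeros((4, 4))
day[0, 1] = day[1, 0] = 1.0      # "work" contacts
day[2, 3] = day[3, 2] = 1.0
night = np.zeros((4, 4))
night[1, 2] = night[2, 1] = 1.0  # "home" contacts

weak_virus_dies = no_epidemic([day, night], beta=0.01, delta=0.9)
strong_virus_dies = no_epidemic([day, night], beta=1.0, delta=0.1)
```

With a single graph (T = 1) the assumed form gives eig(S) = 1 − δ + βλ, so eig(S) < 1 reduces to λ β / δ < 1, matching the static SIS threshold.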
Infection-profile
[Plots: log(fraction infected) vs. time, for Synthetic and MIT Reality Mining; curves ABOVE, AT, and BELOW the threshold]
“Take-off” plots
[Plots: footprint (# infected @ “steady state”, log scale), for Synthetic and MIT Reality; our threshold separates NO EPIDEMIC from EPIDEMIC]
Outline
• Motivation
• Epidemics: what happens? (Theory)
– Background
– Result (Static Graphs)
– Proof Ideas (Static Graphs)
– Bonus 1: Dynamic Graphs
– Bonus 2: Competing Viruses
• Action: Who to immunize? (Algorithms)
Competing Contagions
iPhone v Android
Blu-ray v HD-DVD
Biological: common flu/avian flu, pneumococcal inf., etc.
A simple model
• Modified flu-like
• Mutual Immunity (“pick one of the two”)
• Susceptible-Infected1-Infected2-Susceptible
Virus 2
Virus 1
Question: What happens in the end?
[Plot: number of infections over time; green: virus 1, red: virus 2; footprint @ steady state = ?]
ASSUME: Virus 1 is stronger than Virus 2
Question: What happens in the end?
[Plot: number of infections over time; green: virus 1, red: virus 2; how do the footprints @ steady state relate to the two strengths ??]
ASSUME: Virus 1 is stronger than Virus 2
Answer: Winner-Takes-All
[Plot: number of infections over time; green: virus 1, red: virus 2]
ASSUME: Virus 1 is stronger than Virus 2
Our Result: Winner-Takes-All
Given our model, and any graph, the weaker virus always dies out completely
1. The stronger survives only if it is above threshold
2. Virus 1 is stronger than Virus 2 if: strength(Virus 1) > strength(Virus 2)
3. Strength(Virus) = λ β / δ → same as before!
In Prakash, Beutel+ WWW 2012
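The three rules above translate into a small decision function. This is only a sketch of the stated result: the function names and the example parameter values are hypothetical, "above threshold" is read as strength > 1, and the tie case (equal strengths) is not covered by the slide, so it is reported as undetermined.

```python
def strength(lam, beta, delta):
    """Strength(Virus) = lambda * beta / delta."""
    return lam * beta / delta

def predict_survivor(lam, virus1, virus2):
    """virus1, virus2: (beta, delta) pairs. Returns 'virus 1', 'virus 2',
    'none', or 'undetermined' (equal strengths, not covered by the slide)."""
    s1 = strength(lam, *virus1)
    s2 = strength(lam, *virus2)
    if s1 == s2:
        return "undetermined"
    winner, s = ("virus 1", s1) if s1 > s2 else ("virus 2", s2)
    # The weaker virus dies out; the stronger survives only above threshold.
    return winner if s > 1.0 else "none"

# Example on a graph with first eigenvalue lambda = 4 (hypothetical numbers).
outcome_a = predict_survivor(4.0, (0.5, 0.5), (0.2, 0.5))    # strengths 4.0 vs 1.6
outcome_b = predict_survivor(4.0, (0.1, 0.5), (0.05, 0.5))   # strengths 0.8 vs 0.4
```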
Real Examples
[Google Search Trends data]
Reddit v Digg
Blu-Ray v HD-DVD
Outline
• Motivation
• Epidemics: what happens? (Theory)
• Action: Who to immunize? (Algorithms)
Full Static Immunization
Given: a graph A, virus prop. model and budget k;
Find: k ‘best’ nodes for immunization (removal).
[Figure: example graph with candidate nodes marked ‘?’; k = 2]
Outline
• Motivation
• Epidemics: what happens? (Theory)
• Action: Who to immunize? (Algorithms)
– Full Immunization (Static Graphs)
– Fractional Immunization
Challenges
• Given a graph A, budget k,
Q1 (Metric) How to measure the ‘shield-value’ for a set of nodes (S)?
Q2 (Algorithm) How to find a set of k nodes with highest ‘shield-value’?
Proposed vulnerability measure
λ (λ is the epidemic threshold)
[Figure: example graphs ordered by increasing λ, i.e. increasing vulnerability: “Safe”, “Vulnerable”, “Deadly”]
A1: “Eigen-Drop”: an ideal shield value
Eigen-Drop(S): Δλ = λ - λ_s
[Figure: the original graph and the same graph without nodes {2, 6}]
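Eigen-Drop can be computed exactly on small graphs by removing the chosen nodes and recomputing the first eigenvalue. The sketch below assumes an undirected graph stored as a dense symmetric matrix; the function names and the toy star graph are hypothetical.

```python
import numpy as np

def first_eigenvalue(A):
    """Largest eigenvalue of a symmetric adjacency matrix (0 for an empty graph)."""
    if A.shape[0] == 0:
        return 0.0
    return float(np.linalg.eigvalsh(A)[-1])

def eigen_drop(A, S):
    """Delta lambda = lambda - lambda_s, where lambda_s is the first
    eigenvalue after removing (immunizing) the nodes in S."""
    removed = set(S)
    keep = [v for v in range(A.shape[0]) if v not in removed]
    return first_eigenvalue(A) - first_eigenvalue(A[np.ix_(keep, keep)])

# Toy graph (hypothetical): star with hub 0 and leaves 1..4, so lambda = 2.
star = np.zeros((5, 5))
star[0, 1:] = star[1:, 0] = 1.0

drop_hub = eigen_drop(star, [0])    # removing the hub leaves no edges
drop_leaf = eigen_drop(star, [1])   # removing one leaf leaves a 3-leaf star
```

Removing the hub gives a much larger drop than removing a leaf, which matches the intuition that the hub is the better node to immunize.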
(Q2) - Direct Algorithm too
expensive!
• Immunize k nodes which maximize Δ λ
S = argmax Δ λ
• Combinatorial!
• Complexity:
– Example:
• 1,000 nodes, with 10,000 edges
• It takes 0.01 seconds to compute λ
• It takes 2,615 years to find 5-best nodes!
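The arithmetic behind the example can be reproduced directly: the brute-force search evaluates λ once for every 5-node subset of the 1,000 nodes. The variable names below are illustrative only.

```python
import math

nodes = 1000
k = 5
seconds_per_eigenvalue = 0.01

subsets = math.comb(nodes, k)                 # number of candidate 5-node sets
total_seconds = subsets * seconds_per_eigenvalue
years = total_seconds / (365.25 * 24 * 3600)  # roughly 2,600 years
```

The result lands within a year or two of the slide's figure; the small difference depends on how many days are counted per year.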
A2: Our Solution
• Part 1: Shield Value
– Carefully approximate Eigen-drop (Δ λ)
– Matrix perturbation theory
• Part 2: Algorithm
– Greedily pick best node at each step
– Near-optimal due to submodularity
• NetShield (linear complexity)
– O(nk² + m), n = # nodes; m = # edges
In Tong, Prakash+ ICDM 2010
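To illustrate the greedy idea only, the sketch below picks one node at a time, each time choosing the node whose removal lowers the first eigenvalue the most. This is a deliberately simplified stand-in and not NetShield: it recomputes the exact eigenvalue for every candidate instead of using the approximate shield value, so it is far slower than O(nk² + m) and carries no near-optimality guarantee. The function names and the toy graph are hypothetical.

```python
import numpy as np

def first_eigenvalue(A):
    """Largest eigenvalue of a symmetric adjacency matrix (0 for an empty graph)."""
    if A.shape[0] == 0:
        return 0.0
    return float(np.linalg.eigvalsh(A)[-1])

def greedy_eigen_drop(A, k):
    """Greedily remove k nodes, each step maximizing the exact eigen-drop."""
    remaining = list(range(A.shape[0]))
    chosen = []
    for _ in range(min(k, len(remaining))):
        best_node, best_lam = None, None
        for v in remaining:
            keep = [u for u in remaining if u != v]
            lam = first_eigenvalue(A[np.ix_(keep, keep)])
            if best_lam is None or lam < best_lam:
                best_node, best_lam = v, lam
        chosen.append(best_node)
        remaining.remove(best_node)
    return chosen

# Toy graph (hypothetical): hub 0 with leaves 1-3, hub 4 with leaves 5-8, edge 0-4.
n = 9
G = np.zeros((n, n))
for hub, leaves in ((0, (1, 2, 3)), (4, (5, 6, 7, 8))):
    for leaf in leaves:
        G[hub, leaf] = G[leaf, hub] = 1.0
G[0, 4] = G[4, 0] = 1.0

picked = greedy_eigen_drop(G, k=2)
```

On this toy graph the procedure removes the larger hub first and the smaller hub second, after which no edges remain.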
Experiment: Immunization quality
[Plot: log(fraction of infected nodes) vs. time, lower is better; methods compared: NetShield, PageRank, Betweenness (shortest path), Degree, Acquaintance, Eigs (=HITS)]
Outline
• Motivation
• Epidemics: what happens? (Theory)
• Action: Who to immunize? (Algorithms)
– Full Immunization (Static Graphs)
– Fractional Immunization
Fractional Immunization of Networks
B. Aditya Prakash, Lada Adamic, Theodore
Iwashyna (M.D.), Hanghang Tong, Christos
Faloutsos
Under review
Fractional Asymmetric
Immunization
Drug-resistant Bacteria
(like XDR-TB)
Another
Hospital
Hospital
Fractional Asymmetric
Immunization
Problem: Given k units of disinfectant, how to distribute them to maximize hospitals saved?
Our Algorithm “SMART-ALLOC”
[US-MEDICARE NETWORK 2005]
• Each circle is a hospital, ~3000 hospitals
• More than 30,000 patients transferred
[Figure: CURRENT PRACTICE vs. SMART-ALLOC: ~6x fewer!]
Running Time
[Plot: wall-clock time, lower is better; Simulations: > 1 week; SMART-ALLOC: 14 secs]
> 30,000x speed-up!
Experiments
[Plots, lower is better: PENN-NETWORK (K = 200): ~5x; SECOND-LIFE (K = 2000): ~2.5x]
Acknowledgements
Funding
References
1. Threshold Conditions for Arbitrary Cascade Models on Arbitrary Networks (B. Aditya Prakash, Deepayan Chakrabarti, Michalis Faloutsos, Nicholas Valler, Christos Faloutsos) – In IEEE ICDM 2011, Vancouver (Invited to KAIS Journal Best Papers of ICDM.)
2. Virus Propagation on Time-Varying Networks: Theory and Immunization Algorithms (B. Aditya Prakash, Hanghang Tong, Nicholas Valler, Michalis Faloutsos and Christos Faloutsos) – In ECML-PKDD 2010, Barcelona, Spain
3. Epidemic Spreading on Mobile Ad Hoc Networks: Determining the Tipping Point (Nicholas Valler, B. Aditya Prakash, Hanghang Tong, Michalis Faloutsos and Christos Faloutsos) – In IEEE NETWORKING 2011, Valencia, Spain
4. Winner-takes-all: Competing Viruses or Ideas on fair-play networks (B. Aditya Prakash, Alex Beutel, Roni Rosenfeld, Christos Faloutsos) – In WWW 2012, Lyon
5. On the Vulnerability of Large Graphs (Hanghang Tong, B. Aditya Prakash, Tina Eliassi-Rad and Christos Faloutsos) – In IEEE ICDM 2010, Sydney, Australia
6. Fractional Immunization of Networks (B. Aditya Prakash, Lada Adamic, Theodore Iwashyna, Hanghang Tong, Christos Faloutsos) – Under Submission
7. Rise and Fall Patterns of Information Diffusion: Model and Implications (Yasuko Matsubara, Yasushi Sakurai, B. Aditya Prakash, Lei Li, Christos Faloutsos) – Under Submission
http://www.cs.cmu.edu/~badityap/
Propagation on Large Networks
B. Aditya Prakash
Christos Faloutsos
Data, Analysis, Policy/Action