Generating Uncertain Networks based on Historical Network Snapshots
Meng Han¹, Mingyuan Yan¹, Jinbao Li², Shouling Ji¹ and Yingshu Li¹,²
¹ Department of Computer Science, Georgia State University
² Department of Computer Science and Technology, Heilongjiang University
Workshop on Computational Social Networks
(CSoNet 2013)
OUTLINE
Background
Problem Definition
Algorithms and Theoretical Analysis
Experimental Evaluation
Conclusion
Various Networks
Internet
Molecule Structure
Social Network
Communication Network
Protein-protein interaction Network
Co-author Network
Uncertainty Exists Everywhere
A large number of uncertain networks exist in real life.
 Protein-protein interaction Network
 Topological structure of wireless sensor network
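[Figure: example uncertain networks. A protein-protein interaction network over TIF34, FET3, SMT3, NTG1, RAD59 and RPC40, with edges labeled by the probability of protein interaction, and a wireless sensor network, with edges labeled by the probability of successful communication. Edge labels shown: 0.75, 0.95, 0.651, 0.88, 0.92, 0.69.]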
Challenges
1. How to model and define real-life uncertainty?
 No representative uncertain data set
2. Expensive to manage and mine network uncertainty
 Structural data much harder to manipulate
3. Difficult to decide relationships among nodes
 Affected by many factors
Contributions
 Model uncertainty
 Approximate the dynamic feature of a network by a static
model endowed with some additional features.
 Lower computation cost of managing and mining
uncertainty
 Employ some sampling techniques
 Detect relationship in uncertain networks
 Serve as a framework for measuring the expected number
of common neighbors in uncertain graphs
Problem Definition
 Generating uncertain networks based on historical
network snapshots.
Algorithms and Theoretical Analysis
Existence probability assigned to edge e, based on a function assigning a weight to each snapshot:
(M1) Constant model
(M2) Linear model
(M3) Log model
(M4) Exponential model
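The idea of weighting snapshots can be sketched as follows. This is a minimal illustration, not the authors' formulas: it assumes the existence probability of an edge is the normalized weighted frequency with which the edge appears across snapshots, and the four weight functions (and the names `WEIGHTS` and `edge_probabilities`) are hypothetical stand-ins for M1 to M4.

```python
import math

# Hypothetical weight functions for snapshot index t = 1..T (larger t = more recent).
# The model names follow the slide; the concrete formulas are assumptions.
WEIGHTS = {
    "constant": lambda t: 1.0,
    "linear": lambda t: float(t),
    "log": lambda t: math.log(1.0 + t),
    "exponential": lambda t: math.exp(t),
}

def edge_probabilities(snapshots, model="constant"):
    """Return {edge: probability} as the normalized weighted frequency of each edge."""
    w = WEIGHTS[model]
    weights = [w(t) for t in range(1, len(snapshots) + 1)]
    total = sum(weights)
    probs = {}
    for weight, edges in zip(weights, snapshots):
        # Normalize to undirected edge keys and deduplicate within a snapshot.
        for e in {tuple(sorted(edge)) for edge in edges}:
            probs[e] = probs.get(e, 0.0) + weight / total
    return probs

# Three toy snapshots, oldest first.
snapshots = [
    {("a", "b"), ("b", "c")},
    {("a", "b")},
    {("a", "b"), ("a", "c")},
]
p_const = edge_probabilities(snapshots, "constant")
p_lin = edge_probabilities(snapshots, "linear")
```

Under the constant model every snapshot counts equally, so an edge seen once in three snapshots gets probability 1/3. Under the linear model the same edge gets a higher probability if it appears in a recent snapshot than in an old one.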
 For an uncertain graph G, there are $2^{|E|}$ possible worlds $I_i$ ($1 \le i \le 2^{|E|}$).
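To make the possible-world count concrete, the sketch below enumerates every world of a tiny uncertain graph. It assumes edges exist independently, so the probability of a world is the product of p(e) over present edges and 1 - p(e) over absent ones; the helper name `possible_worlds` and the toy probabilities are illustrative only.

```python
from itertools import product

def possible_worlds(edge_probs):
    """Yield (world, probability) for every subset of edges, assuming independent edges."""
    edges = sorted(edge_probs)
    for mask in product([False, True], repeat=len(edges)):
        prob = 1.0
        world = []
        for present, e in zip(mask, edges):
            p = edge_probs[e]
            prob *= p if present else (1.0 - p)
            if present:
                world.append(e)
        yield frozenset(world), prob

# Toy uncertain graph with |E| = 3, hence 2**3 = 8 possible worlds.
edge_probs = {("a", "b"): 0.9, ("b", "c"): 0.5, ("a", "c"): 0.2}
worlds = list(possible_worlds(edge_probs))
total_probability = sum(p for _, p in worlds)
```

The world probabilities sum to 1, and the number of worlds doubles with each additional edge, which is why exhaustive enumeration is only feasible for very small graphs.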
 Common neighbors are very important
 Enumerating all the possible worlds generated from an uncertain graph G is a #P-complete problem [6].
– We cannot enumerate all the possible worlds to calculate Endistance(u, v) in an uncertain graph!
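Since enumeration is infeasible, expectations can be estimated by sampling possible worlds instead. The sketch below is an illustration under stated assumptions: it uses the number of common neighbors of u and v as the measured quantity (the slide's own measure may differ), assumes independent edges, and compares the Monte Carlo estimate with the closed form that linearity of expectation gives in that case. All function names are hypothetical.

```python
import random

def sample_world(edge_probs, rng):
    """Draw one possible world: keep each edge independently with its probability."""
    return {e for e, p in edge_probs.items() if rng.random() < p}

def common_neighbors(world, u, v):
    """Count the nodes adjacent to both u and v in a deterministic edge set."""
    nbrs = {}
    for a, b in world:
        nbrs.setdefault(a, set()).add(b)
        nbrs.setdefault(b, set()).add(a)
    return len(nbrs.get(u, set()) & nbrs.get(v, set()))

def estimate_expected_common_neighbors(edge_probs, u, v, n_samples, seed=0):
    """Monte Carlo estimate: average the count over sampled worlds."""
    rng = random.Random(seed)
    total = sum(
        common_neighbors(sample_world(edge_probs, rng), u, v)
        for _ in range(n_samples)
    )
    return total / n_samples

def exact_expected_common_neighbors(edge_probs, u, v):
    """Closed form for independent edges: sum over w of p(u, w) * p(w, v)."""
    def p(a, b):
        return edge_probs.get((a, b), edge_probs.get((b, a), 0.0))
    nodes = {x for e in edge_probs for x in e} - {u, v}
    return sum(p(u, w) * p(w, v) for w in nodes)

edge_probs = {
    ("u", "x"): 0.9, ("x", "v"): 0.8,
    ("u", "y"): 0.5, ("y", "v"): 0.4,
    ("u", "v"): 0.7,
}
exact = exact_expected_common_neighbors(edge_probs, "u", "v")
estimate = estimate_expected_common_neighbors(edge_probs, "u", "v", n_samples=20000)
```

For common neighbors the closed form makes sampling unnecessary; the comparison is only there to show that the sampled average converges to the true expectation. Sampling becomes the practical option for measures with no such closed form.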
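 If we sample at least a certain number of possible worlds (the bound depends on ε and δ), we can guarantee that the relative error is at most ε, except with failure probability δ.
ε: upper bound of relative error
δ: failure probability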
Experimental Evaluation
Dataset
 One typical dataset from SNAP (Stanford Large
Network Dataset Collection) was used to evaluate our
algorithm.
Effectiveness Evaluation
Efficiency Evaluation
 Even for an extremely strict correctness requirement, e.g., $\epsilon = 0.005$ and $\delta = 0.005$, the number of samples is only $5.98 \times 10^{12}$ (the whole sampling space is $2^{5000}$).
 With only $1.4 \times 10^{10}$ samples, $\epsilon < 0.08$ and $\delta < 0.05$ can still be guaranteed.
Conclusion
 A framework for generating uncertain
networks based on historical network
snapshots.
 Two uncertainty construction models to
capture uncertainty from dynamic snapshots,
and sampling techniques to improve the
efficiency of the algorithm.
Thanks
Q&A