Distributed Top-K Monitoring
Brian Babcock and Chris Olston
SIGMOD 2003
Presented by En Tzu Wang
Introduction


Distributed top-k monitoring
Naive approach: transmit all updates to a single location

This costs a lot of communication
Running example

1998 FIFA Soccer World Cup




1 billion accesses
30 servers, distributed across 4 geographic locations
Requests are routed based on network latencies
Query


Which pages are currently the most popular across all servers?
Which server in the cluster has the lowest current load?
Notation definition





Coordinator node: N0
Remote monitor nodes: N1, N2, …, Nm
U = {O1, O2, …, On}: data objects, with values {V1, V2, …, Vn}
Update sequence S consists of tuples (Oi, Nj, Δ): node Nj detects a change of Δ in the value of Oi
For each Nj, the partial data values V1,j, V2,j, …, Vn,j represent Nj's view of the data


Logical data values: Vi = Σ(1≤j≤m) Vi,j
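The notation can be made concrete with a small sketch. It assumes that an object's logical value is the sum of its partial values over the monitor nodes; the function names and the sample update stream are hypothetical and only serve as illustration.

```python
from collections import defaultdict

def apply_updates(updates):
    """Build partial values V[object][node] from update tuples (object, node, delta)."""
    partial = defaultdict(lambda: defaultdict(int))
    for obj, node, delta in updates:
        partial[obj][node] += delta
    return partial

def logical_values(partial):
    """Logical value of each object: the sum of its partial values over all monitor nodes."""
    return {obj: sum(per_node.values()) for obj, per_node in partial.items()}

# Hypothetical update sequence S: each tuple is (Oi, Nj, delta).
S = [("O1", "N1", 9), ("O1", "N2", 1), ("O2", "N1", 1), ("O2", "N2", 3)]
V = logical_values(apply_updates(S))
print(V)
```

No single monitor node can compute V on its own, which is why the rest of the approach works with constraints on partial values.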
Problem Definition


Track the top k logical data objects within a bounded error tolerance
A set T ⊆ U with |T| = k is an approximate top-k set iff Vt + ε ≥ Vs for all Ot ∈ T and Os ∈ U − T, where ε ≥ 0 is a user-specified approximation parameter
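A minimal sketch of the validity check for an approximate top-k set, assuming the condition is that every object in the set, plus ε, is at least as large as every object outside it. The function name and sample values are hypothetical.

```python
def is_approx_top_k(T, values, k, eps):
    """True if T has exactly k objects and V_t + eps >= V_s for every t in T, s not in T."""
    T = set(T)
    if len(T) != k or not T <= set(values):
        return False
    inside = [values[o] for o in T]
    outside = [v for o, v in values.items() if o not in T]
    if not inside or not outside:
        return True  # nothing to compare against
    # It is enough to compare the smallest member with the largest non-member.
    return min(inside) + eps >= max(outside)

values = {"O1": 10, "O2": 4, "O3": 9}
print(is_approx_top_k({"O1"}, values, k=1, eps=0))
print(is_approx_top_k({"O3"}, values, k=1, eps=1))
```

With ε = 0 only the exact top-k set passes; a larger ε admits sets whose members are close to the true leaders.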
Overview of this approach



When updates occur, monitor nodes change their partial data values and check that a set of arithmetic constraints is still satisfied
If the constraints are satisfied, the current top-k set is still valid
If not, resolution takes place between the coordinator and the nodes whose constraints failed
Overview of this approach

Without a global view, how can a node reason about logical data values?




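Each Vi,j is associated with an adjustment factor δi,j; the adjustment factors for each data object Oi sum to zero


Example: k = 1; N1: V1,1 = 9, V2,1 = 1; N2: V1,2 = 1, V2,2 = 3
At N2, V1,2 < V2,2, but overall V1 = 10 > V2 = 4
e.g., δ1,1 = −3, δ1,2 = 3, δ2,1 = δ2,2 = 0 gives 6 ≥ 1 at N1 and 4 ≥ 3 at N2
Adjustment factors are assigned by the coordinator during resolution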
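The example can be checked numerically. The sketch below uses the partial values from the slide; the adjustment factors (−3 and +3 for O1, 0 for O2) are one assumed assignment that sums to zero per object, not necessarily the one the coordinator would choose.

```python
# Partial values from the example (k = 1, top-1 set is {O1}).
partial = {"N1": {"O1": 9, "O2": 1}, "N2": {"O1": 1, "O2": 3}}

# Assumed adjustment factors; for each object they sum to zero across nodes.
delta = {"N1": {"O1": -3, "O2": 0}, "N2": {"O1": 3, "O2": 0}}

def local_ok(node, top, other, factors=None):
    """Check the adjusted local ordering of `top` over `other` at one node."""
    adj = factors[node] if factors else {top: 0, other: 0}
    return partial[node][top] + adj[top] >= partial[node][other] + adj[other]

without_factors = {n: local_ok(n, "O1", "O2") for n in partial}
with_factors = {n: local_ok(n, "O1", "O2", delta) for n in partial}
print(without_factors)
print(with_factors)
```

Without factors the raw comparison fails at N2 even though O1 leads overall; with the factors both nodes see O1 ahead.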
Overview of this approach

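To permit a degree of error up to ε, associate adjustment factors δi,0 with the coordinator N0, so that Σ(0≤j≤m) δi,j = 0 for each Oi and δt,0 + ε ≥ δs,0 for all Ot ∈ T, Os ∈ U − T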
Algorithm


The coordinator maintains n(m+1) adjustment factors, labeled δi,j, corresponding to the pairs (Oi, Nj), initially set to 0
Adjustment factor invariants:

For each Oi, the corresponding adjustment factors sum to 0: Σ(0≤j≤m) δi,j = 0
For all pairs Ot ∈ T, Os ∈ U − T: δt,0 + ε ≥ δs,0
Initialize the approximate top-k set T
Select new adjustment factors and send each Nj the set T together with its factors δi,j
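The two invariants can be expressed as a short check. This is a sketch under the assumption that the second invariant bounds the coordinator's own factors by ε (δt,0 + ε ≥ δs,0 for objects inside and outside the top-k set); the data layout, with index 0 holding the coordinator's factor, is a hypothetical choice.

```python
def check_invariants(delta, T, eps):
    """delta[obj] = [d_0, d_1, ..., d_m]; index 0 is the coordinator N0."""
    # Invariant 1: factors of each object sum to zero over N0..Nm.
    sums_to_zero = all(abs(sum(row)) < 1e-9 for row in delta.values())
    # Invariant 2: coordinator factors of top-k objects are within eps of the others.
    coord_ok = all(
        delta[t][0] + eps >= delta[s][0]
        for t in T
        for s in delta
        if s not in T
    )
    return sums_to_zero and coord_ok

factors = {"O1": [0, -3, 3], "O2": [0, 0, 0]}
print(check_invariants(factors, T={"O1"}, eps=0))
```

With all factors initially 0, both invariants hold trivially for any T and any ε ≥ 0.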
Algorithm


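For each pair of objects Ot ∈ T, Os ∈ U − T straddling T, Nj ensures Vt,j + δt,j ≥ Vs,j + δs,j
If every monitor node satisfies its local constraint for each pair, then Vt + ε ≥ Vs holds for each pair, so T remains a valid approximate top-k set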
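Why local constraints are enough can be seen by summing them over the monitor nodes. The derivation below is a sketch; it assumes the local constraint at each node has the form $V_{t,j} + \delta_{t,j} \ge V_{s,j} + \delta_{s,j}$, that logical values are sums of partial values, and that the coordinator's factors satisfy $\sum_{j=0}^{m} \delta_{i,j} = 0$ and $\delta_{t,0} + \varepsilon \ge \delta_{s,0}$.

```latex
\begin{align*}
\sum_{j=1}^{m} \left( V_{t,j} + \delta_{t,j} \right)
  &\ge \sum_{j=1}^{m} \left( V_{s,j} + \delta_{s,j} \right)
  && \text{sum of the local constraints} \\
V_t - \delta_{t,0}
  &\ge V_s - \delta_{s,0}
  && \text{since } \textstyle\sum_{j=0}^{m} \delta_{i,j} = 0 \\
V_t + \left( \delta_{s,0} - \delta_{t,0} \right)
  &\ge V_s \\
V_t + \varepsilon
  &\ge V_s
  && \text{since } \delta_{s,0} - \delta_{t,0} \le \varepsilon
\end{align*}
```

The last line is the approximate top-k condition, so no communication is needed while every node's constraints hold.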
Resolution


Let F be the set of objects whose partial values at Nf are involved in violated constraints
Phase 1: Nf sends a message to N0 containing the failed constraints, a subset of its current partial data values, and a border value
Resolution


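Phase 2: N0 determines the participating nodes with a validation test: consider each pair Ot ∈ T, Os ∈ F involved in a failed constraint
If Vt,f + δt,0 + δt,f ≥ Vs,f + δs,0 + δs,f holds for every such pair, the failure does not invalidate T
N0 performs reallocation to update the adjustment factors
If not, go to Phase 3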
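A sketch of the Phase 2 validation test follows. It assumes the test adds the coordinator's own adjustment factors to both sides of each failed constraint at Nf; the function name, argument layout, and sample numbers are hypothetical.

```python
def validation_test(failed_pairs, v_f, delta_f, delta_0):
    """Return True if the coordinator's factors can absorb every failed constraint.

    failed_pairs: (t, s) pairs whose constraint failed at node N_f
    v_f:          partial values reported by N_f
    delta_f:      adjustment factors currently held by N_f
    delta_0:      adjustment factors held by the coordinator N_0
    """
    return all(
        v_f[t] + delta_0[t] + delta_f[t] >= v_f[s] + delta_0[s] + delta_f[s]
        for t, s in failed_pairs
    )

v_f = {"O1": 2, "O2": 4}
delta_f = {"O1": 1, "O2": 0}
# Local constraint fails at N_f: 2 + 1 < 4 + 0.
print(validation_test([("O1", "O2")], v_f, delta_f, {"O1": 2, "O2": 0}))
print(validation_test([("O1", "O2")], v_f, delta_f, {"O1": 0, "O2": 0}))
```

When the test passes, only N0 and Nf need to take part in reallocation; when it fails, the other nodes have to be contacted.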
Resolution

Phase 3: N0 requests the relevant partial data values and a border value from the other nodes



Compute the new top-k set
Perform reallocation across all nodes to establish
new adjustment factors
Total: 1 + 2(m−1) + m = 3m − 1 messages
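The message count for a full three-phase resolution can be tallied as below. The split into one report, a request and reply for each of the other m−1 nodes, and one final message per monitor node is an interpretation of the three terms, and m = 30 is taken from the running example on the assumption that each server is a monitor node.

```python
def full_resolution_messages(m: int) -> int:
    """Messages for a resolution that reaches Phase 3, with m monitor nodes."""
    report = 1              # Phase 1: N_f reports the failure to N_0
    collect = 2 * (m - 1)   # Phase 3: request + reply for each of the other m-1 nodes
    redistribute = m        # new adjustment factors sent to every monitor node
    return report + collect + redistribute

print(full_resolution_messages(30))
```

The total grows linearly in m, which is why avoiding Phase 3 whenever the validation test passes matters.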
Adjustment factor reallocation

Criterion 1: the new adjustment factors satisfy the invariants

For each object, they sum to 0


Criterion 2: after resolution and reallocation, all new constraints defined by the new top-k set with the new adjustment factors are satisfied
Adjustment factor reallocation




Vi,0 = 0, B0 =
For each Oi, Oi’s participating sum:
Border sum:
Allocation parameter:
Adjustment factor reallocation


The leeway of an object is a measure of the overall slack in the arithmetic constraints involving partial values from the participating nodes
Slack: the numeric gap between the two sides of an inequality
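Slack can be illustrated for a single constraint. The sketch assumes constraints of the form "adjusted value of a top-k object ≥ adjusted value of another object"; the numbers are illustrative, and the leeway computation itself is not reproduced here.

```python
def slack(v_t, d_t, v_s, d_s):
    """Gap between the two sides of  v_t + d_t >= v_s + d_s  (negative if violated)."""
    return (v_t + d_t) - (v_s + d_s)

# Illustrative values: two nodes holding the same pair of objects.
slack_n1 = slack(v_t=9, d_t=-3, v_s=1, d_s=0)
slack_n2 = slack(v_t=1, d_t=3, v_s=3, d_s=0)
print(slack_n1, slack_n2)
```

A node with little slack is likely to violate its constraint soon, so reallocation can shift slack toward such nodes through the adjustment factors.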
Adjustment factor reallocation
Conclusion