qual2 - Berkeley Database Research

optimizing heterogeneous
networks
Qualifying Exam
UC Berkeley
16 October 2007
David Chu
Computer Science Division
EECS Department
UC Berkeley
can a declarative approach help us cope with heterogeneous networks?
• dsn: declarative sensor networks (chu-sensys07)
• sdlib: sensor data library (chu-spots06)
• ken: approximate data collection (chu-icde06)
• optimizing heterogeneous networks
• optimizing for adaptability and competing incentives
• in the field: does it help?
legend: completed, ongoing, future
3
outline
• declarative sensor networks
• motivation
• language & example programs
• feasibility assessment
• optimizing heterogeneous networks
• optimizing for adaptability and competing incentives
• deployment application
4
heterogeneous networks
• … are proliferating
• more demands on expert engineers
• sensornets as one example…
5
context
Sensor Networks
early experiences
6
motivation
programming sensor networks is difficult
building entire sensor systems is even harder
7
inspiration
sensor networks
data management
network design
8
inspiration : data management
declarative is widely used in data management
• relational databases
• spreadsheets
• abstract “what” from “how”
• (Sensor-Network-As-Database) [MFH05], [YG02]
9
inspiration : network design
declarative is a new idea in networking
• compact
• flexible
• analyzable, optimizable
• Internet Routing, Overlays built declaratively (the P2 project)
10
inspiration
sensor networks
data management
( DSN )
network design
11
what we did
• adapted declarative language
• wrote declarative examples
• built compiler & runtime for sensornets
12
brief language overview
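[figure: annotated rule syntax. Rule1 has a head and two body predicates, bodyOne and bodyTwo; the annotations read “implies” (head from body), “join” (between the body predicates), and “don’t care” (an ignored argument). Rule2 and a Fact are shown in the same syntax.]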
Built-ins: builtin(temperature, ‘TemperatureImplementingModule.c’).
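The rule anatomy above (a head implied by a join of body predicates, with a "don't care" argument) can be illustrated with a small Python sketch. The rule shape `head(X, Z) :- bodyOne(X, Y), bodyTwo(Y, Z, _)` and the tuples are hypothetical, chosen only to show the join; they are not taken from the slide, and this is not how the DSN compiler evaluates rules.

```python
def eval_rule(body_one, body_two):
    """Evaluate the hypothetical rule head(X, Z) :- bodyOne(X, Y), bodyTwo(Y, Z, _)."""
    head = set()
    for (x, y1) in body_one:
        for (y2, z, _ignored) in body_two:   # third field is the "don't care" argument
            if y1 == y2:                     # shared variable Y acts as the join condition
                head.add((x, z))
    return head

# Facts for the two body relations (illustrative values only)
body_one = {("a", "b"), ("c", "d")}
body_two = {("b", "e", 1), ("d", "f", 2), ("x", "y", 3)}

derived = eval_rule(body_one, body_two)
print(sorted(derived))
```

The shared variable plays the role of the join key, and the ignored field is simply never compared.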
13
a full example : tree
[figure: tree-construction example; node labels S, Z, C1, C2, C, D]
14
working programs
tracking
geographic routing
localization
tree routing
multi-hop collection
link estimator
trickle, in-network data aggregation, beacon vector routing,
pathDCS, convex hull fallback routing, right-hand rule fallback routing, …
15
evaluating gossip dissemination
[figures: dissemination from a source node as shown in the designer’s paper [LPC04] (*borrowed picture) and as produced by the DSN specification, on 10x6 and 30x2 topologies]
19
evaluating tree-collection
1. tree construction
2. collection: messages flow toward root
*borrowed picture
[plots: messages sent and hop-counts (similar performance)]
21
lines of code
22
compiled size
TelosB mote code space = 48KB, data space = 10KB
23
live tracking demo (vldb06)
24
architectural flexibility
• dsn can…
• describe entire system stack
• application + network + mac layers
• naturally expose abstractions
• freely mix and match with outside libraries
25
thoughts up to now…
• sensor networks
→ data + communication
• several examples of functional programs
• feasible for today’s hardware platforms
• can explore variety of architectures…
26
[current work]
optimizing heterogeneous networks
27
outline
• declarative sensor networks
• optimizing heterogeneous networks
• motivation
• dimensions for optimization
• execution space by example
• finding optimal
• optimizing for adaptability and competing incentives
• deployment application
28
the protocol problem
• above: workloads vary
• below: greater diversity of networks
• who is going to figure out the best protocol for your
situation?
• protocol optimization has a long history [CDO97],
[VLC88], [HA89],…
29
implications of declarative
• concise “what” programming
• 1 specification, N possible execution plans
30
[diagram: the Planner takes a network protocol specification, the environment (workload, network characteristics), and cost criteria, and produces an optimized execution strategy]
31
manually engineering solutions
design parameters: state, rendezvous?
problem areas:
• packet filtering
• content distribution networks
• routing
• quality of service
• session state
• distributed event detection
manual choices made today:
• choose tradeoff between node state or packet state
• explicitly choose stateless variants when critical
• manual configuration of distribution points
• offload to resource-rich gateways
• domain-specific protocols e.g. DSR, 6lowpan
• configure detector placement
32
toward automated solutions
• identify possible executions of program
• search execution space for optimal plan (the Planner)
• but first some rough idea of what we’re after…
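The two steps above (enumerate possible executions, then search for the cheapest) can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the two candidate plans, the environment fields, and the cost weights are hypothetical, not the planner described in the talk.

```python
def plan_cost(plan, env):
    """Toy cost model (assumed): node state costs memory per flow,
    packet state costs extra bytes on every hop of every message."""
    if plan == "node_state":
        return env["flows"] * env["mem_weight"]
    if plan == "packet_state":
        return env["msgs"] * env["hops"] * env["byte_weight"]
    raise ValueError(f"unknown plan: {plan}")

def choose_plan(plans, env):
    """Exhaustively score every candidate execution and keep the cheapest."""
    return min(plans, key=lambda p: plan_cost(p, env))

PLANS = ["node_state", "packet_state"]

light = {"flows": 100, "mem_weight": 5, "msgs": 50, "hops": 4, "byte_weight": 1}
heavy = {"flows": 100, "mem_weight": 5, "msgs": 1000, "hops": 4, "byte_weight": 1}

print(choose_plan(PLANS, light))
print(choose_plan(PLANS, heavy))
```

The same specification yields a different plan once the workload changes, which is the point of separating the "what" from the execution strategy.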
34
packet filter program
% forwarding rule
% (message is a recursive relation; nexthop is a base relation)
message(@Next,Src,Dest,Load) :- message(@Crt,Src,Dest,Load), nexthop(@Crt,Dest,Next).
% firewall rule
% (receive is a non-recursive relation)
receive(@Crt,Src,Dest,Load) :- message(@Crt,Src,Dest,Load),
Crt == Dest, firewall(@Crt,Load).
% query
receive(@Crt,Src,Dest,Load)?
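To make the semantics of the packet filter program concrete, here is a centralized Python sketch that evaluates the forwarding rule to a fixpoint and then applies the firewall rule. It is only an approximation of the distributed execution: the dictionary encoding of `nexthop` (one next hop per node and destination), the three-hop path, and the load values are assumptions made for the example.

```python
def run_packet_filter(nexthop, firewall, initial):
    """nexthop: {(crt, dest): next}; firewall: set of (node, load) pairs;
    initial: set of message tuples (crt, src, dest, load)."""
    messages = set(initial)
    frontier = set(initial)
    while frontier:                      # forwarding rule, applied until no new facts appear
        derived = set()
        for (crt, src, dest, load) in frontier:
            nxt = nexthop.get((crt, dest))
            if nxt is not None and (nxt, src, dest, load) not in messages:
                derived.add((nxt, src, dest, load))
        messages |= derived
        frontier = derived
    # firewall rule: deliver only at the destination and only if the load is allowed there
    receive = {m for m in messages if m[0] == m[2] and (m[0], m[3]) in firewall}
    return messages, receive

# Hypothetical path f -> d -> b -> a
nexthop = {("f", "a"): "d", ("d", "a"): "b", ("b", "a"): "a"}
firewall = {("a", "ok")}
initial = {("f", "f", "a", "ok"), ("f", "f", "a", "bad")}

messages, receive = run_packet_filter(nexthop, firewall, initial)
print(sorted(receive))
```

Note that the rejected message still crosses every hop before it is dropped at the destination, which is the waste the rewrites target.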
35
[figure: each node a through g holds its own rows of the nexthop table]
base relation as table partitioned across nodes
36
[figure: nodes a through g connected as a graph]
base relation as network graph
37
[figure: receive, message, and firewall placed on the a through g topology]
39
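[figure: nodes A, B, D, E holding msg() facts (M, N) and nexthop() facts]
40
[figure: nodes A, B, D, E with relations r and f]
41
[figure: nodes A, B, D, E with relations r and f, continued]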
42
[figure: message, firewall, and receive on the a through g topology; links carry numeric labels 9, 8, 1, 9, 2, 10]
varying join selectivity shifts the optimal rendezvous
1: Meet-In-The-Middle (MiM) Rewrite
applications: packet filtering, event detection, CDNs
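A toy cost model shows how join selectivity can move the best rendezvous point. The sketch below is not from the talk: it assumes a complete binary collection tree with messages originating at the leaves, a filter relation that starts at the root and is copied down to every node at the chosen depth, and `selectivity` as the fraction of messages that survive the join. All names and numbers are hypothetical.

```python
def total_cost(depth, tree_depth, msg_size, filter_size, selectivity):
    """Cost of joining at `depth` (0 = root) in a complete binary tree (assumed model)."""
    leaves = 2 ** tree_depth
    edges_to_depth = 2 ** (depth + 1) - 2        # links crossed to copy the filter down to `depth`
    unfiltered_hops = tree_depth - depth         # hops each message travels before the join
    filtered_hops = depth                        # hops travelled only by surviving messages
    msg_cost = leaves * msg_size * (unfiltered_hops + selectivity * filtered_hops)
    return filter_size * edges_to_depth + msg_cost

def best_rendezvous(tree_depth, msg_size, filter_size, selectivity):
    """Exhaustively try every depth and return the cheapest one."""
    return min(range(tree_depth + 1),
               key=lambda d: total_cost(d, tree_depth, msg_size, filter_size, selectivity))

for sel in (0.9, 0.5, 0.0):
    print(sel, best_rendezvous(tree_depth=3, msg_size=3, filter_size=2, selectivity=sel))
```

When most messages pass the join there is little to gain from moving the filter, so the rendezvous stays at the root; as the join becomes more selective the best depth moves toward the leaves.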
43
[figure: on the a through g topology, message is split into two parts (message = HiResVideo + EventDetectorFlag), shown with firewall and receive]
2: Query Scramble Rewrite
applications: query scrambling, mitigating latency × bandwidth
44
[figure: on the a through g topology, firewall is split into two parts (firewall = allowed ports + allowed options), shown with message and receive]
3: Semijoin Rewrite
application: semijoin
45
[figure: receive, message, firewall, and connections on the a through g topology; firewall and connections are two separate relations at the endpoint]
3b: Traditional Join Placement
application: semijoin
46
[figure: the a through g topology]
47
[figure: nexthop table rows copied between nodes on the a through g topology]
4: Pullback Rewrite
application: alternate between DVR (distance vector routing) and SR (source routing)
49
[figure: message on the a through g topology, with the pulled-back nexthop rows attached]
4: Pullback Rewrite
application: alternate between DVR (distance vector routing) and SR (source routing)
50
routing layer state proxying
[figure: Sensornet nodes A, B, C, D connected to the Internet. Under distance vector routing the nexthop forwarding table holds A: D via B, B: D via C, C: D via D; the alternative carries a source route to D]
4: Pullback Rewrite example
51
[figure: three variants (regular, stateless, proxy) of an exchange among nodes A, B, C, with messages M, M’ and session state L, L’]
5: SessionState Rewrite
52
[figure: nodes A, B, C exchange messages M and M’; session state L becomes L’ at a node]
5: SessionState Rewrite
application: pushing state into packets/proxies
53
[figure: nodes A, B, C exchange messages that carry the session state with them (M,L and M’,L’), starting from initial state L0]
5: SessionState Rewrite
application: pushing state into packets/proxies
54
[figure: nodes A, B, C exchange M, M’ with state L, L’; annotations mark a prefetch read and a deferred write of the state]
5: SessionState Rewrite
application: pushing state into packets/proxies
55
summary of rewrite family
56
example from DSN paper: before rewrite
% 1. Sample temperature and initiate multihop send
transmit(@Src, Temperature) :- thermometer(@Src, Temperature), timer(@Src, collectionTimer, Period).
% 2. Prepare message for multihop transmission
message(@Src, Src, Dst, Data) :- transmit(@Src, Data), nextHop(@Src, Dst, Next, Cost)~.
% 3. Forward message to next hop parent
% (message is a recursive relation; nextHop is a base relation)
message(@Next, Src, Dst, Data) :- message(@Crt, Src, Dst, Data), nextHop(@Crt, Dst, Next, Cost)~, Crt != Dst.
% 4. Receive when at destination
% (alert is a non-recursive relation)
alert(@Crt, Src, Data) :- message(@Crt, Src, Dst, Data), Crt == Dst, anomaly(@Crt, Thresh), Data > Thresh.
57
message(),nexthop(),nexthop(),…, nexthop(),…, nexthop(),anomaly()
alert()
rendezvous()
58
example from DSN paper: after rewrite
% (the rewrite adds the rendezvous literals in rules 3, 4 and 5)
% 1. Sample temperature and initiate multihop send
transmit(@Src, Temperature) :- thermometer(@Src, Temperature), timer(@Src, collectionTimer, Period).
% 2. Prepare message for multihop transmission
message(@Src, Src, Dst, Data) :- transmit(@Src, Data), nextHop(@Src, Dst, Next, Cost)~.
% 3. Forward message to next hop parent
message(@Next, Src, Dst, Data) :- message(@Crt, Src, Dst, Data), nextHop(@Crt, Dst, Next, Cost)~, Crt != Dst, -rendezvous(@Crt).
% 4. Receive when at optimal (middle) point
alert’(@Crt, Crt_, Src, Data) :- message(@Crt, Src, Dst, Data), Crt == Dst, anomaly’(@Crt, Crt_, Thresh_), Data > Thresh_, rendezvous(@Crt).
% 5. Move anomaly’() backward
anomaly’(@Crt, Crt, Thresh) :- anomaly(@Crt, Thresh).
anomaly’(@Prev, Crt_, Thresh) :- anomaly’(@Crt, Crt_, Thresh), nextHop(@Prev, Crt_, Crt, Cost), -rendezvous(@Crt).
% 6. Auxiliary rules to move alert’() to alert()
alert’(@Next, Crt_, Src, Data) :- alert’(@Crt, Crt_, Src, Data), nextHop(@Crt, Crt_, Next, Cost).
alert(@Crt, Src, Data) :- alert’(@Crt, Crt, Src, Data).
59
finding optimal
• goal: generate optimal rendezvous facts
• get statistics for both workload and resources
• synopses: well-understood database summarization
technique
• relevant statistics: join selectivity, size of tables, nexthop
distribution, link costs, storage costs
• run optimization
• currently custom optimizer algorithms for each rewrite
(rewrites are still general)
• should use general purpose dynamic programming
optimizer
• in both cases, algorithms are exhaustive
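As one example of the statistics listed above, the selectivity of a predicate such as `Data > Thresh` can be estimated from a histogram synopsis rather than from the raw readings. The sketch below uses an equi-width histogram with a uniform-within-bucket assumption; the bucket layout, the sample readings, and the function names are all hypothetical and are not the synopses used in the actual system.

```python
def build_histogram(values, lo, hi, buckets):
    """Equi-width histogram synopsis over the range [lo, hi)."""
    width = (hi - lo) / buckets
    counts = [0] * buckets
    for v in values:
        i = min(int((v - lo) / width), buckets - 1)   # clamp the top edge into the last bucket
        counts[i] += 1
    return counts

def estimate_selectivity(counts, lo, hi, thresh):
    """Estimated fraction of values above `thresh`, assuming values are
    spread uniformly inside each bucket."""
    width = (hi - lo) / len(counts)
    passing = 0.0
    for i, c in enumerate(counts):
        b_lo = lo + i * width
        b_hi = b_lo + width
        if thresh <= b_lo:
            passing += c                               # whole bucket lies above the threshold
        elif thresh < b_hi:
            passing += c * (b_hi - thresh) / width     # partial bucket
    return passing / sum(counts)

readings = [10, 12, 18, 22, 25, 31, 35, 38]
counts = build_histogram(readings, lo=0, hi=40, buckets=4)
estimate = estimate_selectivity(counts, lo=0, hi=40, thresh=25)
print(counts, estimate)
```

The estimate differs from the exact fraction of the sample, which is the kind of synopsis error an optimizer has to tolerate.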
60
pullback rewrite: early results
simulation: 35 nodes, 1 sender, 30 receivers, 1000 e2e sends
[plot annotations: utilize available resource; account for workload; DVR inactive until enough table space for all receivers]
61
planned experiment 1
MiM for event detection
• setup
• 30 node sensor network
• all nodes send samples via collection tree to root
• root maintains list of anomaly thresholds
• hypothesis
• MiM optimization lowers total energy
consumption by identifying optimal tree depth to
which to push down anomaly thresholds
62
planned experiment 2
SessionState for web server
• setup
• 1 server, X clients in P2
• each client connects to the server using a stateful
protocol
• hypothesis
• SessionState optimization increases the number
of clients that can connect at overall lower
messaging cost by giving “stateful” priority to
more frequently connecting clients
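The hypothesis can be made concrete with a small Python sketch: a server with a fixed number of state slots keeps state for its most frequent clients and serves the rest statelessly at a higher per-connection messaging cost. The connection counts, the slot limit, and the two cost constants are assumptions for illustration, not measurements or the planned implementation.

```python
def assign_stateful(freq, slots):
    """Give the limited stateful slots to the most frequently connecting clients."""
    ranked = sorted(freq, key=lambda c: (-freq[c], c))   # ties broken by client name
    return set(ranked[:slots])

def messaging_cost(freq, stateful, stateful_cost=1, stateless_cost=3):
    """Total cost, assuming a stateless connection must carry its state in the
    packet and is therefore more expensive than a stateful one."""
    return sum(n * (stateful_cost if c in stateful else stateless_cost)
               for c, n in freq.items())

# Connections per client (hypothetical workload)
freq = {"c1": 50, "c2": 5, "c3": 20, "c4": 1}

chosen = assign_stateful(freq, slots=2)
print(sorted(chosen), messaging_cost(freq, chosen))
print(messaging_cost(freq, {"c2", "c4"}))   # a poor assignment, for comparison
```

With per-connection costs that do not depend on the client, ranking by frequency is enough; a richer cost model would need a real optimizer.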
63
status of current work
• captured design/execution space:
• rendezvous and state
• identified ways to expose possible executions
• MiM Family of Rewrites
• implemented first cut planner
• todo
• implement generic rewriter
• implement generic optimizer
• prove semantic invariance w.r.t. queried relation
• evaluate planner on test scenarios
64
[future work]
optimizing for adaptability and competing incentives
65
outline
• declarative sensor networks
• optimizing heterogeneous networks
• optimizing for adaptability and competing incentives
• why adaptability?
• potential optimization for adaptability
• why competing incentives?
• potential optimization for competing incentives
• deployment application
66
optimize for best case
performance?
• environment changes quickly
optimizer run infrequently
• partially observable environment
e.g. imperfect synopses
• delayed visibility of environment
• adaptability as theme of DB research in last
decade [DIR07], [KD98], [IFF+99], [AH00]
67
options for adaptability
• iterating traditional optimizer more often [KD98]
• PRO: natural port of algorithms and semantics
• CON: state may still be obscured
• CON: synopsis collection overhead increases
• run-time adaptability [IFF+99], [AH00]
• PRO: continuous distributed planning
• CON: unclear what global semantics are achieved
68
adaptive rewrite for latency
% (dynamic rendezvous: the negated literals in rules 3 and 5 replace the static rendezvous facts)
% 1. Sample temperature and initiate multihop send
transmit(@Src, Temperature) :- thermometer(@Src, Temperature), timer(@Src, collectionTimer, Period).
% 2. Prepare message for multihop transmission
message(@Src, Src, Dst, Data) :- transmit(@Src, Data), nextHop(@Src, Dst, Next, Cost)~.
% 3. Forward message to next hop parent
message(@Next, Src, Dst, Data) :- message(@Crt, Src, Dst, Data), nextHop(@Crt, Dst, Next, Cost)~, Crt != Dst, -anomaly’(@Crt, Crt_, Thresh_).
% 4. Receive when at optimal (middle) point
alert’(@Crt, Crt_, Src, Data) :- message(@Crt, Src, Dst, Data), Crt == Dst, anomaly’(@Crt, Crt_, Thresh_), Data > Thresh_.
% 5. Move anomaly’() backward
anomaly’(@Crt, Crt, Thresh) :- anomaly(@Crt, Thresh).
anomaly’(@Prev, Crt_, Thresh) :- anomaly’(@Crt, Crt_, Thresh), nextHop(@Prev, Crt_, Crt, Cost), -message(@Crt, Src, Crt_, Data).
% 6. Auxiliary rules to move alert’() to alert()
alert’(@Next, Crt_, Src, Data) :- alert’(@Crt, Crt_, Src, Data), nextHop(@Crt, Crt_, Next, Cost).
alert(@Crt, Src, Data) :- alert’(@Crt, Crt, Src, Data).
69
global query plan
[figure: query plan over relations m, n, f with joins J1, J2, J3 producing r]
70
derivation graph
[figure: original derivation and augmented derivation over m, n, r, f; dashed edges represent negation]
the lack of a unique model (multiple possible outcomes) is well known
71
potential experiment 1
• setup
• 30 node event detection sensor network (similar to
previously planned experiment)
• use MiM rewrite in both “best-case” and “adaptable”
mode to find optimal rendezvous
• hypothesis
• if anomaly list does not change frequently, “best-case” performs better
• if anomaly list does change frequently, “adaptable”
performs better
72
potential experiment 2
• setup
• X clients, 1 server (similar to previously planned
experiment)
• relax synopsis accuracy assumptions e.g. most
frequent clients (table distribution) not
accurately reported
• hypothesis
• as accuracy of synopses decreases, “adaptable”
performs better than “best-case”
73
optimizing for competing
incentives
• assumed single administrative domain
• sensornets
• corporate CDNs
• individual ASs
• competing incentives
• privately-owned servers & hosts
• ISP peering relationships
• multiple sensornet queries
• individuals have an incentive to deviate from
the global optimum [MRW+04], [SDK+94]
74
[figure: Semijoin Rewrite on the a through g topology with PINRELATION(connections); firewall and connections are two separate relations at the endpoint; links carry numeric labels 9, 8, 1, 9, 2, 10]
3: Semijoin Rewrite
75
[figure: the a through g topology split into Domain 1 and Domain 2; each domain has its own private plan (private plan 1, private plan 2) for message]
76
[application]
large scale and fine-grained
debris flow monitoring
77
outline
• declarative sensor networks
• optimizing heterogeneous networks
• optimizing for adaptability and competing
incentives
• deployment application
• geological study
• declarative’s role
78
[Left] La Conchita, California – a small seaside community along Highway 101 south of Santa Barbara.
This landslide and debris flow occurred in the spring of 1995. A reoccurrence in 2005 claimed 4 lives and
resulted in 29 missing persons. [Right] Chehalis, Washington - landslides and debris flows during the winter
storms of February 1996. Photographs by R.L. Schuster, U.S. Geological Survey.
79
Day Fire
Harvard Burn Site
[Above] The locations of the
2005-2006 and 2006-2007
debris flow deployment sites.
[Top Right] Smoke from the
Day Fire. [Middle Right]
Recently burned hillside in
Burbank, CA was the site of
two debris flows in the 2005-2006
winter season. [Bottom Right]
Base of the channel after
debris flow with remaining
sediment. [Bottom Left] Burn-resilient vegetation is quickly
recovering just a few months
after the fires and debris flows.
80
[Above] Parshall flume used in conjunction with
water level logger at the channel’s choke-point.
[Top Right] Custom overland flow sensor for fine-grained detection of water runoff. [Bottom
Right] Solar-powered base station for actuating
and gathering data from the wireless sensor
network, shown here connected to laptop
during testing.
82
declarative sensornets
in context
• how easy is managing sensornets with
declarative programming?
• are the network optimizations useful in
helping out a real deployment?
• event detection
• network management
83
timeline
• Fa07:
  • complete current work on Optimizing Heterogeneous Networks
    • implement generic rewriter
    • implement generic optimizations (concurrent work on optimization framework)
    • prove rewrite correctness
    • fully evaluate net planner
  • help with USGS 1st deployment
    • continue engineering and testing hardware and software for 1st round deployment
    • deploy in selected site, help with issues as they arise
• Sp08+Fa08:
  • start and complete future work on Optimizing for Adaptability/Competing Incentives
    • understand related work, decide on which course to pursue
    • design and implement generic rewrites
    • decide which platform(s) to use
    • augment runtime for runtime decision making
  • help with USGS 2nd deployment
• Sp09:
  • write thesis
84
thanks
collaborators
Joe Hellerstein, Scott Shenker, Ion Stoica,
Lucian Popa, Arsalan Tavakoli
Tsung-Te Lai
Phil Levis, Jung Woo Lee, Aby John,
Kevin Klues
Daniel Malmon, Joel Johnson
85
references
[MFH05] Samuel Madden, Michael J. Franklin, Joseph M. Hellerstein, Wei Hong.
TinyDB: an acquisitional query processing system for sensor networks. ACM
Transactions on Database Systems (TODS), Volume 30 , Issue 1, March 2005.
[YG02] Yong Yao, J. E. Gehrke. The Cougar Approach to In-Network Query
Processing in Sensor Networks. Sigmod Record, Volume 31, Number 3. September
2002.
[LPC04] Philip Levis, Neil Patel, David Culler, and Scott Shenker. Trickle: A Self-Regulating Algorithm for Code Propagation and Maintenance in Wireless Sensor
Networks. In Proceedings of the First USENIX/ACM Symposium on Networked
Systems Design and Implementation, 2004.
[DIR07] Amol Deshpande, Zachary Ives, Vijayshankar Raman. Adaptive Query
Processing. Foundations and Trends in Databases: Vol. 1: No 1, pp 1-140, 2007.
[CDO97] C. Castelluccia, W. Dabbous, and S. O’Malley. Generating efficient
protocol code from an abstract specification. IEEE/ACM Trans. Netw., 5(4):514–524,
1997.
[VLC88] S. T. Vuong, A. C. Lau, and R. I. Chan. Semiautomatic implementation of
protocols using an estelle-c compiler. IEEE Trans. Softw. Eng., 14(3):384–393, 1988.
[HA89] D. Hernek and D. P. Anderson. Efficient automated protocol
implementation using rtag. Technical Report UCB/CSD-89-526, EECS Department,
University of California, Berkeley, Aug 1989.
references (2)
[KD98] N. Kabra and D. J. DeWitt. Efficient mid-query re-optimization of suboptimal
query execution plans. In SIGMOD ’98: Proceedings of the 1998 ACM SIGMOD
international conference on Management of data, (New York, NY, USA), pp. 106–
117, ACM Press, 1998.
[IFF+99] Z. G. Ives, D. Florescu, M. Friedman, A. Levy, and D. S. Weld, “An adaptive
query execution system for data integration,” in SIGMOD ’99: Proceedings of the
1999 ACM SIGMOD international conference on Management of data, (New York,
NY, USA), pp. 299–310, ACM Press, 1999.
[AH00] R. Avnur and J. M. Hellerstein, “Eddies: continuously adaptive query
processing,” in SIGMOD ’00: Proceedings of the 2000 ACM SIGMOD international
conference on Management of data, (New York, NY, USA), pp. 261–272, ACM
Press, 2000.
[MRW+04] R. Mahajan, M. Rodrig, D. Wetherall and J. Zahorjan. Experiences
Applying Game Theory to System Design. SIGCOMM 2004 Workshop on Practice
and Theory of Incentives and Game Theory in Networked Systems (PINS), Portland,
OR, August 2004.
[SDK+94] Michael Stonebraker, Robert Devine, Marcel Kornacker, Witold Litwin, Avi
Pfeffer, Adam Sah, and Carl Staelin. An Economic Paradigm for Query Processing
and Data Migration in Mariposa, Sequoia 2000 Technical Report 94/49, University of
California, Berkeley, CA, Apr. 1994.
backup slides
88
evaluation
93
a declarative architecture
• why rethink the architecture?
• disparate application requirements
• breaking of traditional abstraction boundaries
• what are the implications?
• architectural flexibility is essential
• put resource management in user’s hands
95
resource management
• memory
• processor
• energy
96
session state program
% include “forwarding rule”
% respond to message
msg(@Crt,Dest,Src,NewLoad) :- msg(@Crt,Src,Dest,Load), Crt == Dest,
NewLoad = modify_load(Load).
% local state update
local(@Crt,NewState) :- msg(@Crt,Src,Dest,Load), Crt == Dest,
local(@Crt,State),
NewState = modify_state(State,Load).
5: SessionState Rewrite
97