Query Processing in Mobile P2P Databases IGERT Seminar Presentation Bo Xu

advertisement
Query Processing in Mobile P2P
Databases
IGERT Seminar Presentation
Bo Xu
joint work with Ouri Wolfson
Talk outline






Introduction
System Model
The MARKET Algorithm
Evaluation
Extension to CTS
Conclusion and Future Work
July 12, 2016
IGERT seminar
2
Query Processing Environments
Motivation: a general purpose query processing strategy mobile
disconnected wireless ad-hoc networks
Wireless ad-hoc
networks
Query longevity
nuous
Instantaneous
Sensor
Mobile
MANET
Static
Vehicular Sensor Network (VSN)
July 12, 2016
Connectivity
Sparse
Dense
GPS receiver
chemical spill detector
still/video camera
vibration sensor
acoustic detector
IGERT seminar
3
Store-and-forward to deal with sparseness
Q
QA
A
r
q
Q
A
qA
July 12, 2016
IGERT seminar
A
4
Issues with Store-and-forward

How to manage limited memory, power, and
bandwidth?

Which reports to save/transmit?
July 12, 2016
IGERT seminar
5
Difficulty of Store-and-forward
Case: Each mobile node is interested in every data-item
Assume that the trajectories of all nodes is known a priori at a central
server. If memory, energy, and bandwidth are bounded at mobile nodes,
then the problem of determining whether a set of data-items can be
disseminated to all the mobile nodes is NP-complete.
Mobile P2P: Trajectories unknown a priori; Heuristics needed
July 12, 2016
IGERT seminar
6
Talk outline






Introduction
System Model
The MARKET Algorithm
Evaluation
Extension to CTS
Conclusion and Future Work
July 12, 2016
IGERT seminar
7
Mobile P2P Database
Pda’s, cell-phones, sensors, hotspots, vehicles, with short-range
wireless capabilities
report 8
A
Local query
Local database
C
query C
query A
report 1
report 2
report 3
query B
report 4
report 5
B
• Applications coexist
• Variable report sizes
• A peer can be a produce, consumer, and broker
July 12, 2016
IGERT seminar
8
Queries

A query Q maps each report R to a match
degree:
0  match( R, Q )  1

Examples:



Top parking slots given my current location
match(R,Q)=e-t-d
Profile with expertise “children-periodontics”
Similarity between two images
July 12, 2016
IGERT seminar
9
Query/report Dissemination



Two peers within transmission range exchange
queries and reports
Least relevant reports that do not fit in local broker
database are purged
Exchange not necessarily synchronous (periodic
broadcast)
July 12, 2016
IGERT seminar
10
Talk outline






Introduction
System Model
The MARKET Algorithm
Evaluation
Extension to CTS
Conclusion and Future Work
July 12, 2016
IGERT seminar
11
Ranking Factors
Rank of a report R is determined by

Demand: What fraction of peers are querying R


Supply: What fraction of peers already have R


Probability that a peer is interested in R
Probability that a peer has R
Size of R
July 12, 2016
IGERT seminar
12
Rank of a report
expected benefit = demand(R)*(1supply(R))
reports
benefit
0.7
0.5
0.4
0.5
0.8
reports database
0.3
Rank(R)= demand(R)*(1supply(R))
size(R)
July 12, 2016
IGERT seminar
13
Report Ranking: sample demand
r2
r5
q1
q7
O1
r3
O2 r6
q2
q6
r1
r2
r5
r4
q1
q3
q4
q7
O1
O2
O3
O3
reports
relation
r1
r4
r7
r8
q3
q4
q5
q8
r1
r4
r7
r8
queries
relation
(a)
r4
r3
r6
r1
q2
q4
q3
q6
q3
q4
q5
q8
(b)
Queries relation is FIFO maintained
July 12, 2016
IGERT seminar
14
Rank of Reports

Demand for R
n
 match( R, Q )
i
i 1
n


Qi’s are the members of the queries relation
Size of the queries relation determined based
on Hoeffding’s inequality
confidence _level  1  2e
2 n ( confidence_interval_width) 2
E.g., if n=108, then with 95% chance the demand
estimation error is smaller than 0.08
July 12, 2016
IGERT seminar
15
How does peer O determine supply(R)?
number _ of _ peers _ that _ have _ R
supply ( R) 
total _ number _ of _ peers


A parametric formula giving the supply is
beyond the state of the art
O machine-learns supply(R) based on metadata of R:



Age of R
Number of times O sighted R from other peers
etc.
July 12, 2016
IGERT seminar
16
Computing Supply by Machine-learning
MAchine LEarning based Novelty rAnking (MALENA)
Reports database of O
report -id
R1
R4
R2
R7
Report
description
…
…
…
…
aro
fin
1
1
2
4
3
2
4
2
aro: The age rank order within O’s reports database
fin: The number of times O has sighted the report from other peers
July 12, 2016
IGERT seminar
17
MALENA
Reports database of O
(a)
reportReport aro
id
description
R1
R2
…
…
Tracking
Set of O
fin
R1
1
3
R2
2
1
R3
R1, R2
Advertise
to peer
B
Examples
Examplescreated
createdby
ReceiveREQ
Examples set
ES of O
insert
aro
fin
label
1
3
negative
old
2
1
new
positive
(b)
Tracking
Set of O
Reports database of O
reportReport aro
id
description
R1
…
1
R2
…
fin
2
July 12, 2016
report- Report
aro
id
description
R2
1
R3
R1
…
1
Tracking
Set of O
fin
B
R1
4
Reports database of O
(c)
ReceiveREQ
RequestR2)
R2
((R1,R2),
R1
IGERT
seminar
4
R2
ReceiveRPT
(R4)
18
MALENA Implementation Considerations

Minimize overhead



No need to actually store examples
Model incrementally built
Bayesian learning a simple but effective
method
July 12, 2016
IGERT seminar
19
Talk outline






Introduction
System Model
The MARKET Algorithm
Evaluation
Extension to CTS
Conclusion and Future Work
July 12, 2016
IGERT seminar
20
Comparison with RANDI (MDM’07)
throughput (matches/peer)
peer density (D) = 20, energy allocation fraction (F) = 0.15
20 production
peers within
transmission
range
report
rate (P)
= 0.1 reports/second
140
120
100
80
60
MARKET
RANDI
Ideal Benchmark
40
20
0
0
30
60 90 120 150 180 210 240 270 300
response-time bound (second)
RANDI=MARKET-supply
July 12, 2016
throughput (matches/peer)
mobility model=random way point, average motion speed=1 mile/hour
transmission range=100 meters, mean of reports database size=100Kbytes
queries database size=100 queries
report size uniformly distributed between 1K and 2K bytes
0.1 report produced per second
peer density (D) = 1, energy allocation fraction (F) = 0.15
report1production
rate transmission
(P) = 0.1 reports/second
peer within
range
140
MARKET
RANDI
120
Ideal Benchmark
100
80
60
40
20
0
0
3600 7200 10800 14400 18000 21600 25200 28800
response-time bound (second)
MARKET half as good as ideal benchmark
MARKET twice better than RANDI
IGERT seminar
21
Comparison with LRU and LFU
throughput (matches/peer)
mobility model=iMotes traces
mean of reports database size=150Kbytes
queries database size=10 queries
report size uniformly distributed between 2K and 20K bytes
0.1 report produced per second, transmission size=100Kbytes
5.5
5
4.5
MARKET
4
RANDI
3.5
LRU
LFU
3
2.5
2
0
10
20
30
40
50
response-time bound (second)
(results obtained by Fatemeh Vafaee)
July 12, 2016
IGERT seminar
22
Evaluation of MALENA (TAAS’09)
turn-over: peers enter/exit system
injection: number of peers that have a report initially
mobility model=iMotes traces, reports database size=100 reports
2 reports produced per second, transmission size=10 reports
throughput (reports/peer)
throughput (reports/peer)
high-turn-over/high-injection
low-turn-over/low-injection
low-turn-over/low-injection
Bluetooth,
Bluetooth,
M=100,
M=100,
f=2, f=2,
q=0.1,
q=0.1,
flooding
flooding
low-turn-over/low-injection
high-turn-over/high-injection
200
14000
180
12000
160
10000
140
1208000
100
806000
604000
40
MALE
MALE
NANA
2000
aro-ranking
aro-ranking
20
fin-ranking
fin-ranking
0 0
0 50 10
5 10
15 15
20 20
25 25
30 30
35 3540 4045 4550 5055556060
response-time
response-time
bound
bound
(minute)
(minute)
MALENA always follows the best indicator
July 12, 2016
IGERT seminar
23
Application: K-nearest-neighbors
query-point
sink



Query: K-nearest-neighbors of a fixed location (query-point)
Reports: current locations of mobile sensors
match(Q,R): in reverse proportion to the distance from query point
July 12, 2016
IGERT seminar
24
Itinerary based KNN processing
Phase I: Query delivered to the sensor closest to query point
Phase II: Query traverses an itinerary to collect answers
Phase III: Answers returned to sink
July 12, 2016
IGERT seminar
25
Simulation Results
mobility model=random way point, average motion speed=1 mile/hour
transmission range=100 meters
report size=24 bytes, query size=16 bytes
mean of reports database size=100 reports
one location report produced at each sensor per second
100%
accuracy
80%
60%
MARKET
40%
Upper bound of itinerary
20%
0%
1
2
3
4
5
6
peer density
MARKET is especially suitable for sparse environments
July 12, 2016
IGERT seminar
26
Talk outline






Introduction
System Model
The MARKET Algorithm
Evaluation
Extension to CTS
Conclusion and Future Work
July 12, 2016
IGERT seminar
27
TrafficInfo: Disseminating Traffic
Information in VANET’s
July 12, 2016
IGERT seminar
28
What does relevance mean in TrafficInfo
B
B
A
A
A report is relevant if it changes the route
July 12, 2016
IGERT seminar
29
Which factors indicate relevance of
report?




Distance to the reported road segment
Type of road segment
Speed variance
…
July 12, 2016
IGERT seminar
30
Conceptual Learning Procedure




An example is created for a received
report
The example is labeled positive if the
report changes route and negative
otherwise
Individual vs. group
How to deal with aggregation?
July 12, 2016
IGERT seminar
31
Conclusion



sensor-rich
+ short-range
Query
processing
environment
wireless
 Mobile P2P
Store-and-forward enables in-network
processing in mobile disconnected
networks
Ranking is important for dealing with
memory, bandwidth, and energy
constraints
July 12, 2016
IGERT seminar
32
Future Work

Multimedia reports



Utilization of metadata
Integration of stateless and stateful
approaches
Starvation/fairness
July 12, 2016
IGERT seminar
33
Thanks!
Questions?
July 12, 2016
IGERT seminar
34
802.11 Basics




3 modes: transmitting, receiving, listening
(order of power consumption)
When listening: if detecting a message
destined to host  receive-mode
Time divided into slots, 20microsecs each
Transmission:



Listen for 1 time slot
If channel free start broadcast (observe collision
possible)
Broadcast may last for many time slots
July 12, 2016
IGERT seminar
35
Energy Efficiency of a Broadcast
successfully receive the
broadcast from x
X
Collisions occur at neighbor
Throughput (Th) =
(expected number of neighbors that successfully receive broadcast)  (broadcast size)
Power efficiency (PE) =
July 12, 2016
Th
En
IGERT seminar
36
Computation of Throughput
X
Y
Conditions for successful reception at an arbitrary node Y
1. No green node inside starts to broadcast at the same
time slot with X
2. No transmission from any purple node overlaps with
that from X
July 12, 2016
IGERT seminar
37
Energy Constraints
Energy consumed by a 802.11 network
interface for transmitting a message of size M
bytes
En=fM+g
For 802.11 broadcast, g=26610-6 Joule,
f=5.2710-6Joule/byte

July 12, 2016
IGERT seminar
38
Experimental MP2P Projects (Pedestrians)






7DS – Columbia University (web pages)
iClouds – Darmstadt Univ. (incentives)
MoGATU – UMBC (specialized query processing,
e.g., collaborative joins)
PeopleNet – NUS, IIS-Bangalore (Mobile commerce,
information type  location baazar)
MoB – Wisconsin, Cambridge (incentives,
information resources e.g. bandwidth)
Mobi-Dik – Univ. of Illinois, Chicago (brokering,
physical resources, bandwidth/memory/power
management)
July 12, 2016
IGERT seminar
39
Vehicular Projects

Inter-vehicle Communication and Intelligent
Transportation:






CarTALK 2000 is a European project
VICS (The Vehicle Information and Control System) is a
government-sponsored system in Japan with an 11-year track
record
FleetNet, an inter-vehicle communications system, is being
developed by a consortium of private companies and universities in
Germany
IVI (Intelligent Vehicle Initiative) and VII (Vehicle
Infrastructure Integration), the US DOT
MP2P provides data management capabilities on top of
these communication systems
Grassroots, TrafficView, SOTIS, V3 – P2P dissemination of
traffic info to reduce travel times
July 12, 2016
IGERT seminar
40
RANk-based DIssemination (RANDI)



Ranking of reports
Bandwidth/energy aware
Exchange enhances





Consumer functionality
Broker functionality
Consumer: Answer local query (pull)
Broker: Transmit reports most likely requested by
future-encountered peers (push)
Transmission trigger:


Encounter
New reports
July 12, 2016
IGERT seminar
41
RANDI
When two peers meet they conduct a two-phase exchange:
Phase 1
answers
satisfied as a consumer (pull)
Phase 2
enhanced as a broker (push)
Phase 1: Exchange queries and receive answers (pull)
Phase 2: Exchange more reports using available energy/bandwidth (push)
Combination of:
unicast
(thin line) and
broadcast (thick lines) to enable overhearing.
July 12, 2016
IGERT seminar
42
RANDI (Cont’d)
To solve problem with static peers:
Two interaction modes which combine pull and push
new reports
• Query-response: triggered by discovery of new neighbors
• Relay: triggered by receipt of new reports
 Disseminate to existing neighbors
July 12, 2016
IGERT seminar
43
7DS
P2P mode: each node periodically broadcasts its query and receives
reports from neighboring peers. No strategy to determine query
frequency and transmission size. Cache management based on webpage expiration time.
query
query
reports
reports
query
July 12, 2016
query
IGERT seminar
44
PeopleNet
Reports are randomly selected for exchanging and saving upon
encountering.
Peer A
Peer B
Peer A
Peer B
random-spread
before exchange
Peer A
Peer B
after exchange
Peer A
Peer B
random-swap
July 12, 2016
before exchange
IGERT seminar
after exchange
45
7DS
Each peer periodically broadcasts its query and receives reports from
neighboring peers. No strategy to determine query frequency and
transmission size. Cache management based on web-page expiration
time.
query
query
reports
reports
query
July 12, 2016
query
IGERT seminar
46
PeopleNet
Reports are randomly selected for exchanging and saving upon
encountering.
Peer A
Peer B
Peer A
before exchange
July 12, 2016
Peer B
after exchange
IGERT seminar
47
Mobile Local Search: Applications

transportation





social networking (wearable website)





Search for victims in a rubble
asset management and tracking


Sale on an item of interest at mall
Music-file exchange
emergency response


Personal profile of interest at a convention
Singles matchmaking
Floating BBS
mobile electronic commerce


Announce sudden stop, malfunctioning brake light, patch of ice
Floating car data
Dissemination of multi-media traffic information (picture, video, voice)
Search close-by taxi customer, parking slot, ride-share
Sensors on containers exchange security information => remote
checkpoints
tourist and location-based-services

Closest ATM
July 12, 2016
IGERT seminar
48
Applications – Common features


Mobile/stationary peers
Resources of interest



in a limited geographic area
Short time duration
Can be solved by fixed servers, but


Unlikely solution
Proposed mp2p paradigm can enhance fixed
solution (reliability, performance, coverage)
July 12, 2016
IGERT seminar
49
MARKET
When two peers meet they conduct a two-phase exchange:
Local query
Phase 1
answers
satisfied as a consumer (pull)
Phase 2
enhanced as a broker (push)
Phase 1: Exchange subscriptions and receive answers (pull)
Phase 2: Exchange more publications using available energy/bandwidth (push)
Combination of:
unicast
(thin line) and
broadcast (thick lines) to enable overhearing.
July 12, 2016
IGERT seminar
50
MARKET (Cont’d)
To solve problem with static peers:
Two interaction modes which combine pull and push
new publications
• Query-response: triggered by discovery of new neighbors
• Relay: triggered by receipt of new publications
 Disseminate to existing neighbors
July 12, 2016
IGERT seminar
51
Query in static disconnected network
Q
r
Q
A
q
Q
A
In-network query processing may not be possible
July 12, 2016
IGERT seminar
52
Query in static connected sensor network
Q
A
r
Q
A
A
A
Q
A
A
qA
q
Q
Data transmission delay is 0.
July 12, 2016
QA
A
Answer can be obtained instantaneously
IGERT seminar
53
Query in static disconnected network
Q
r
Q
A
q
Q
A
In-network query processing may not be possible
July 12, 2016
IGERT seminar
54
Query in mobile disconnected network
Query processing enabled by mobility and store-and-forward
qA
r
QA
A
q
A
One hop case
July 12, 2016
IGERT seminar
55
Query in mobile disconnected network
Q
QA
A
r
q
Q
A
qA
A
The
answer
is in
disseminated
only after
anitanswer
node receives query
Query
Multil-hop
can be
case
network processed,
but
is delayed
Query processing
doesn’t
control motion.
First alogrithm
stage: query
disseminated
during encounter
July 12, 2016
IGERT seminar
56
Download