IFLOW: Self-managing distributed information flows
Brian Cooper
Yahoo! Research

Joint work with colleagues at Georgia Tech: Vibhore Kumar, Zhongtang Cai, Sangeetha Seshadri, Greg Eisenhauer, Karsten Schwan, and others
Overview
- Motivation
- Case study: inTransit
- Architecture
- Flow graph deployment/reconfiguration
- Experiments
- Other aspects of the system
Motivation
- Lots of data produced in lots of places
  - Examples: operational information systems, scientific collaborations, end-user systems, web traffic data
Airline example
[Diagram: data flows in an airline operational information system. Events and sources (customers check in, bags scanned, flights arriving, flights departing, weather updates, catering updates, FAA updates, shop for flights, check seats, rebook missed connections) feed displays (concourse display, gate display, baggage display, home user display).]
Previous solutions
- Tools for managing distributed updates
  - Pub/sub middlewares
  - Transaction Processing Facilities
  - In-house solutions
- Times have changed
  - How to handle larger data volumes?
  - How to seamlessly incorporate new functionality?
  - How to effectively prioritize service?
  - How to avoid hand-tuning the system?
Approach
- Provide a self-managing distributed data flow graph
[Diagram: an example flow graph. Weather data, flight data, and check-in data feed operators (select ATL data, correlate flights and reservations, predict delays, generate customer messages) whose output is delivered to a terminal or web display.]
Approach
- Deploy operators in a network overlay
- Middleware should self-manage this deployment
  - Provide the necessary performance and availability
  - Respond to business-level needs
IFLOW
[Diagram: two example flow graphs running on the IFLOW middleware. An airline flow graph joins FLIGHTS, WEATHER, and COUNTERS streams for an overhead display; a molecular dynamics collaboration flow graph computes coordinates, distances and bonds, and radial distance for an IPaq client, an X-Window client, and an ImmersaDesk.]

AirlineFlowGraph {
  Sources -> {FLIGHTS, WEATHER, COUNTERS}
  Sinks -> {DISPLAY}
  Flow-Operators -> {JOIN-1, JOIN-2}
  Edges -> {(FLIGHTS, JOIN-1), (WEATHER, JOIN-1), (JOIN-1, JOIN-2), (COUNTERS, JOIN-2), (JOIN-2, DISPLAY)}
  Utility -> [Customer-Priority, Low Bandwidth Utilization]
}

CollaborationFlowGraph {
  Sources -> {Experiment}
  Sinks -> {IPaq, X-Window, ImmersaDesk}
  Flow-Operators -> {Coord, DistBond, RadDist, CoordBond}
  Edges -> {(Experiment, Coord), (Coord, DistBond), (DistBond, RadDist), (RadDist, IPaq), (CoordBond, ImmersaDesk), (CoordBond, X-Window)}
  Utility -> [Low-Delay, Synchronized-Delivery]
}

[ICAC ’06]
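A flow-graph description like the ones above can be captured in a small data structure. The sketch below is a minimal Python rendering of the AirlineFlowGraph specification, using an invented FlowGraph class rather than the actual IFLOW API:

from dataclasses import dataclass, field

@dataclass
class FlowGraph:
    # Illustrative container for an IFLOW-style flow graph description
    # (field names are assumptions, not IFLOW's real interface).
    sources: list                                  # event sources
    sinks: list                                    # data consumers
    operators: list                                # flow operators
    edges: list                                    # (from, to) connections
    utility: list = field(default_factory=list)    # business-level goals

# The AirlineFlowGraph from the slide, expressed with this structure.
airline = FlowGraph(
    sources=["FLIGHTS", "WEATHER", "COUNTERS"],
    sinks=["DISPLAY"],
    operators=["JOIN-1", "JOIN-2"],
    edges=[("FLIGHTS", "JOIN-1"), ("WEATHER", "JOIN-1"),
           ("JOIN-1", "JOIN-2"), ("COUNTERS", "JOIN-2"),
           ("JOIN-2", "DISPLAY")],
    utility=["Customer-Priority", "Low Bandwidth Utilization"],
)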
Case study
- inTransit
  - Query processing over distributed event streams
  - Operators are streaming versions of relational operators
Architecture
[Diagram: the inTransit distributed stream management infrastructure layered over the IFLOW middleware. The application layer hands queries to a data-flow parser; the middleware layer combines ECho pub-sub messaging, Stones, PDS, and flow-graph control; the underlay layer is IFLOW.]
[ICDCS ’05]
Application layer
- Applications specify data flow graphs
  - Can specify the graph directly
  - Can use an SQL-like declarative language (the compiled graph is sketched below)

STREAM N1.FLIGHTS.TIME, N7.COUNTERS.WAITLISTED, N2.WEATHER.TEMP
FROM N1.FLIGHTS, N7.COUNTERS, N2.WEATHER
WHEN N1.FLIGHTS.NUMBER='DL207'
AND N7.COUNTERS.FLIGHT_NUMBER=N1.FLIGHTS.NUMBER
AND N2.WEATHER.LOCATION=N1.FLIGHTS.DESTINATION;

[Diagram: the query compiles to a flow graph in which a selection on 'DL207' over N1.FLIGHTS is joined (⋈) with N7.COUNTERS and N2.WEATHER, and results are delivered to N10.]
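For reference, the declarative query above compiles to roughly the following flow graph; the structure mirrors the diagram, but the operator names and the dictionary layout are illustrative, not inTransit's internal representation:

# The STREAM query compiles to a selection plus two joins delivered to N10.
# Names such as SELECT_DL207 and JOIN_1 are invented for illustration.
dl207_flow = {
    "sources":   ["N1.FLIGHTS", "N7.COUNTERS", "N2.WEATHER"],
    "sinks":     ["N10"],
    "operators": ["SELECT_DL207", "JOIN_1", "JOIN_2"],
    "edges": [
        ("N1.FLIGHTS", "SELECT_DL207"),   # WHEN NUMBER='DL207'
        ("SELECT_DL207", "JOIN_1"),
        ("N7.COUNTERS", "JOIN_1"),        # FLIGHT_NUMBER = NUMBER
        ("JOIN_1", "JOIN_2"),
        ("N2.WEATHER", "JOIN_2"),         # LOCATION = DESTINATION
        ("JOIN_2", "N10"),
    ],
}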
Middleware layer
- ECho – pub/sub event delivery
  - Event channels for data streams
  - Native operators
    - E-code for most operators
    - Library functions for special cases
- Stones – operator containers
  - Queues and actions (a generic sketch follows below)
[Diagram: a join operator (⋈) hosted in a stone consumes events from Channel 1 and Channel 2 and publishes results on Channel 3.]
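Conceptually, a stone pairs an input queue with an action applied to each event, publishing results on an outgoing channel. The sketch below is a generic Python rendering of that idea, not the actual ECho/Stones API:

from collections import deque

class OperatorContainer:
    # Generic sketch of a stone-like container: a queue plus an action.
    def __init__(self, action, output):
        self.queue = deque()     # buffered incoming events
        self.action = action     # function applied to each event
        self.output = output     # downstream callable (e.g., publish to a channel)

    def enqueue(self, event):
        self.queue.append(event)

    def process(self):
        while self.queue:
            result = self.action(self.queue.popleft())
            if result is not None:   # filters may drop events
                self.output(result)

# Example: a selection operator that only forwards flight DL207 events.
select_dl207 = OperatorContainer(
    action=lambda ev: ev if ev.get("NUMBER") == "DL207" else None,
    output=lambda ev: print("publish on channel 3:", ev),
)
select_dl207.enqueue({"NUMBER": "DL207", "TIME": "18:05"})
select_dl207.process()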
Middleware layer
- PDS – resource monitoring
  - Nodes update PDS with resource info
  - inTransit notified when conditions change (a toy version is sketched below)
[Diagram: nodes report CPU status to PDS, which answers queries about CPU availability.]
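The monitoring pattern amounts to nodes publishing resource measurements and interested parties being called back when a watched value crosses a threshold. This is a toy stand-in for PDS with invented names, not its real interface:

class ResourceMonitor:
    # Toy stand-in for PDS: nodes publish resource info, subscribers get callbacks.
    def __init__(self):
        self.values = {}    # (node, metric) -> latest reported value
        self.watches = []   # (node, metric, threshold, callback)

    def watch(self, node, metric, threshold, callback):
        self.watches.append((node, metric, threshold, callback))

    def update(self, node, metric, value):
        self.values[(node, metric)] = value
        for (n, m, threshold, callback) in self.watches:
            if n == node and m == metric and value > threshold:
                callback(node, metric, value)

# inTransit would register interest in, say, CPU load on a node hosting a join.
pds = ResourceMonitor()
pds.watch("N7", "cpu", 0.8, lambda n, m, v: print(f"reconsider deployment: {m} on {n} = {v}"))
pds.update("N7", "cpu", 0.93)   # crosses the threshold, triggers the callback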
Flow graph deployment
- Where to place operators?
Flow graph deployment
- Where to place operators?
- Basic idea: cluster physical nodes (a greedy delay-based sketch follows below)
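One simple way to cluster physical nodes is by network delay: a node joins an existing cluster if it is close to that cluster's coordinator, otherwise it starts a new cluster. The greedy sketch below is illustrative; the threshold and the delay estimator are assumptions, not IFLOW's exact algorithm:

def cluster_nodes(nodes, delay, threshold_ms=20.0):
    # Greedy delay-based clustering: returns {coordinator: [member nodes]}.
    # delay(a, b) is an estimated round-trip delay in ms, supplied by the caller.
    clusters = {}
    for node in nodes:
        for coordinator in clusters:
            if delay(node, coordinator) <= threshold_ms:
                clusters[coordinator].append(node)   # join a nearby cluster
                break
        else:
            clusters[node] = [node]                  # otherwise become a coordinator
    return clusters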
Flow graph deployment
- Partition flow graph among coordinators
  - Coordinators represent their cluster
  - Exhaustive search among coordinators (sketched below)
[Diagram: the 'DL207' flow graph (a selection and two joins) being assigned across the clusters containing N1, N2, N7, and N10.]
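Partitioning boils down to an exhaustive search over assignments of operators to coordinators, scored by an estimated cost such as delay or bandwidth; each coordinator then runs the same kind of search inside its own cluster to pick actual host nodes. A brute-force version, with assumed names and a caller-supplied cost model, might look like this:

from itertools import product

def best_partition(operators, coordinators, cost):
    # Try every operator -> coordinator assignment and keep the cheapest.
    # cost(assignment) estimates e.g. end-to-end delay or bandwidth for a
    # mapping {operator: coordinator}; it is assumed to be provided by the caller.
    best, best_cost = None, float("inf")
    for choice in product(coordinators, repeat=len(operators)):
        assignment = dict(zip(operators, choice))
        c = cost(assignment)
        if c < best_cost:
            best, best_cost = assignment, c
    return best, best_cost

The search space is |coordinators| raised to the number of operators, which stays manageable precisely because the search sees a few coordinators (and, within a cluster, a few nodes) rather than the whole network.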
Flow graph deployment
- Coordinator deploys the subgraph in its cluster
  - Uses exhaustive search to find the best deployment within the cluster
[Diagram: a coordinator choosing which node in its cluster will host a join operator.]
Flow graph reconfiguration
- Resource or load changes trigger reconfiguration
  - Clusters reconfigure locally
  - Large changes require inter-cluster reconfiguration (a trigger sketch follows below)
[Diagram: a join operator migrating to a different node after conditions change.]
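A reconfiguration trigger can be sketched as comparing the utility of the current deployment against the best candidate and migrating only when the improvement outweighs the disruption. The threshold and function names below are assumptions for illustration:

def maybe_reconfigure(current, candidates, utility, migration_penalty=0.05):
    # Reconfigure only if some candidate deployment beats the current one by
    # more than an assumed migration penalty (illustrative, not IFLOW's policy).
    best = max(candidates, key=utility)
    if utility(best) - utility(current) > migration_penalty:
        return best      # move operators: locally, or via the parent coordinator
                         # if the better deployment spans another cluster
    return current       # ignore small fluctuations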
Hierarchical clusters
- Coordinators themselves are clustered
  - Coordinators form a hierarchy
  - May need to move operators between clusters
    - Handled by moving up a level in the hierarchy
What do we optimize?
- Basic metrics
  - Bandwidth used
  - End-to-end delay
- Autonomic metrics
  - Business value
  - Infrastructure cost
[Plot: business utility as a function of end-to-end delay (0–50 ms) and user priority (0–10); utility rises with user priority and falls as delay grows. An illustrative utility function is sketched below.]
[ICAC ’05]
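The utility surface in the plot can be approximated by a function that rewards high-priority users and decays with end-to-end delay, with infrastructure cost subtracted to give net utility. The exponential shape and constants below are assumptions for illustration, not the deployed utility function:

import math

def business_utility(priority, delay_ms, max_priority=10):
    # Illustrative: utility grows with user priority and decays with delay.
    return (priority / max_priority) * math.exp(-delay_ms / 20.0)

def net_utility(priority, delay_ms, cost_dollars_per_sec):
    # Net benefit = business value minus (scaled) infrastructure cost.
    return business_utility(priority, delay_ms) - 0.01 * cost_dollars_per_sec

# A priority-10 customer at 10 ms delay vs. a priority-2 customer at 40 ms.
print(business_utility(10, 10), business_utility(2, 40))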
Experiments
- Simulations
  - GT-ITM transit-stub Internet topology (128 nodes)
  - NS-2 to capture a trace of delay between nodes
  - Deployment simulator reacts to the delay trace
- OIS case study
  - Flight information from Delta Air Lines
  - Weather and news streams
  - Experiments on Emulab (13 nodes)
Approximation penalty
[Chart: end-to-end delay (ms) for centralized vs. decentralized deployment as the number of nodes in the flow graph grows from 4 to 14. Flow graphs on the simulator.]
Impact of reconfiguration
[Chart: end-to-end delay (ms) over 2000 seconds for dynamic vs. static deployment. 10-node flow graph on the simulator.]
Impact of reconfiguration
[Chart: end-to-end delay (ms) over 2000 seconds for dynamic vs. static deployment under injected network congestion and increased processor load. 2-node flow graph on Emulab.]
Different utility functions
[Chart: actual utility, cost (10^3 dollars/sec), and delay (ms) obtained when the optimization criterion is utility, cost, or delay. Simulator, 128-node network.]
Query planning
- We can optimize the structure of the query graph
  - A different join order may enable a better mapping
  - But there are too many plan/deployment possibilities to consider exhaustively
- Use the hierarchy for planning
  - Plus: stream advertisements to locate sources and deployed operators
  - Planning algorithms: top-down and bottom-up (a bottom-up sketch follows the diagrams below)
[IPDPS ’07]
Planning algorithms
- Top-down: start from the full query A⋈B⋈C⋈D and recursively split it into subplans (A⋈B and C⋈D) that are planned at lower levels of the hierarchy.
[Diagram: top-down decomposition of A⋈B⋈C⋈D into A⋈B and C⋈D, then into the base streams A, B, C, D.]
Planning algorithms
- Bottom-up: plan small subqueries such as A⋈B first, then combine them with C and D until the full query A⋈B⋈C⋈D is assembled.
[Diagram: bottom-up composition from A⋈B up to A⋈B⋈C⋈D.]
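The bottom-up strategy can be sketched as a greedy planner that repeatedly joins the cheapest pair of already-planned subplans until a single plan covers the query; the cost model and function names are assumptions, not the IPDPS ’07 algorithm verbatim:

def plan_bottom_up(streams, join_cost):
    # Greedy bottom-up join planning: repeatedly merge the cheapest pair.
    # streams is a list of source names; join_cost(left, right) estimates the
    # cost (e.g. bandwidth) of joining two subplans, which may be nested tuples.
    plans = list(streams)
    while len(plans) > 1:
        i, j = min(
            ((a, b) for a in range(len(plans)) for b in range(a + 1, len(plans))),
            key=lambda ab: join_cost(plans[ab[0]], plans[ab[1]]),
        )
        merged = (plans[i], plans[j])
        plans = [p for k, p in enumerate(plans) if k not in (i, j)] + [merged]
    return plans[0]   # e.g. ((('A', 'B'), 'C'), 'D')

Calling plan_bottom_up(['A', 'B', 'C', 'D'], join_cost) with a cost model that favors A⋈B first would reproduce the order shown in the diagram above.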
Query planning
[Chart: bandwidth cost per unit time (dollars) for phased vs. combined planning. 100 queries, each over 5 sources, 64-node network.]
Availability management
- Goal is to achieve both:
  - Performance
  - Reliability
- These goals often conflict!
  - Spend scarce resources on throughput or availability?
  - Manage the tradeoff using a utility function
Fault tolerance
- Basic approach: passive standby
  - Log of messages can be replayed
  - Periodic “soft-checkpoint” from active to standby
[Diagram: when the active join operator (⋈) fails, the standby replica takes over.]
- Performance versus availability (fast recovery)
  - More soft-checkpoints = faster recovery, but higher overhead
  - Choose a checkpoint frequency that maximizes utility (a sketch follows below)
[Middleware ’06]
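The passive-standby scheme can be sketched as an active operator that logs every message and periodically ships a soft checkpoint of its state to the standby; on failure, the standby installs the last checkpoint and replays only the messages logged after it. The classes and the checkpoint interval below are illustrative, not the Middleware ’06 implementation:

import copy

class Standby:
    def __init__(self):
        self.state = {}
    def install(self, state):
        self.state = state                  # receive a soft checkpoint
    def recover(self, pending_log):
        for message in pending_log:         # replay messages since the checkpoint
            self.state[message["key"]] = message
        return self.state

class ActiveOperator:
    def __init__(self, standby, checkpoint_every=100):
        self.state = {}                     # operator state (e.g., a join window)
        self.log = []                       # messages since the last soft checkpoint
        self.standby = standby
        self.checkpoint_every = checkpoint_every

    def process(self, message):
        self.log.append(message)
        self.state[message["key"]] = message                  # toy state update
        if len(self.log) >= self.checkpoint_every:
            self.standby.install(copy.deepcopy(self.state))   # soft checkpoint
            self.log.clear()                                  # shorter replay on failure

A smaller checkpoint_every means a shorter log to replay (faster recovery) but more checkpoint traffic; the checkpoint frequency is the knob the utility function tunes.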
Proactive fault tolerance
- Goal: predict system instability
[Two figure-only slides follow, covering proactive fault tolerance and mean time to recovery.]
IFLOW beyond inTransit
[Diagram: inTransit, pub/sub, science applications, and other services all run over the self-managing information flow layer, which hides the complex infrastructure beneath it.]
Related work
- Stream data processing engines
  - STREAM, Aurora, TelegraphCQ, NiagaraCQ, etc.
  - Borealis, TRAPP, Flux, TAG
- Content-based pub/sub
  - Gryphon, ARMADA, Hermes
- Overlay networks
  - P2P
  - Multicast (e.g. Bayeux)
  - Grid
- Other overlay toolkits
  - P2, MACEDON, GridKit
Conclusions
- IFLOW is a general information flow middleware
  - Self-configuring and self-managing
  - Based on application-specified performance and utility
- inTransit distributed event management infrastructure
  - Queries over streams of structured data
  - Resource-aware deployment of query graphs
  - IFLOW provides utility-driven deployment and reconfiguration
- Overall goal
  - Provide useful abstractions for distributed information systems
  - Make the implementation of those abstractions self-managing
  - Key to scalability, manageability, and flexibility
For more information
- http://www.brianfrankcooper.net
- cooperb@yahoo-inc.com