Massive Scale-out of Expensive Continuous Queries

Erik Zeitler and Tore Risch

Uppsala Database Laboratory

Uppsala University

Outline

1. Introduction
2. Stream splitting strategies for scale-out
3. Evaluating stream splitting strategies
4. Cost model and heuristic
5. Energy efficiency
6. Related work
7. Conclusions and future work

31 Aug 2011 Erik Zeitler and Tore Risch 2

[Figure: system overview. Labels: user or programmer; input data streams; stream data access software; query processing software; metadata; stored data; query result data stream.]

Research Questions

▪ How to ensure scalable continuous query (CQ) execution
  • with growing input stream rate?
  • with high CQ execution cost?
  By scale-out.
▪ CQs are scaled out by splitting the input stream.
  • applications require customizable input stream splitting, called splitstream
  • both tuple routing and broadcast are allowed
▪ How to split massive streams over massively parallel CQs?
  • By parallelization of splitstream

[Figure: splitstream feeds parallel CQs, whose results are merged.]

Defining stream splitting

▪ splitstream(stream s, integer q, function rfn, function bfn) → vector of stream sv

[Figure: stream s enters splitstream, which produces the vector of streams sv.]

▪ User defines rfn and bfn
  • rfn(object tpl, integer q) → integer
    rfnLRB(event e, integer q) → integer as select expressway(e) where eventtype(e) = 0;
  • bfn(object tpl) → boolean
    bfnLRB(event e) → boolean as select eventtype(e) = 2;
▪ rfn and bfn for streams are analogous to fragmentation and replication conditions in distributed DBMS (DDBMS)
▪ Unlike in a DDBMS, execution of rfn and bfn is parallelized
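The splitstream signature above can be illustrated with a minimal Python sketch. It is not SCSQ code. It assumes that streams are modelled as lists, that bfn is checked before rfn, that rfn returns None when its where-clause does not match (so the tuple is omitted), and that rfn returns a valid index below q. The dictionary fields "type" and "xway" are hypothetical stand-ins for eventtype(e) and expressway(e).

```python
from typing import Callable, Iterable, List, Optional

Event = dict  # hypothetical tuple representation: {"type": int, "xway": int}


def splitstream(
    s: Iterable[Event],
    q: int,
    rfn: Callable[[Event, int], Optional[int]],
    bfn: Callable[[Event], bool],
) -> List[List[Event]]:
    """Split stream s into q output streams (modelled here as lists)."""
    sv: List[List[Event]] = [[] for _ in range(q)]
    for tpl in s:
        if bfn(tpl):
            # Broadcast: copy the tuple to every output stream.
            for out in sv:
                out.append(tpl)
            continue
        i = rfn(tpl, q)
        if i is not None:
            # Route: send the tuple to exactly one output stream.
            # Assumption: rfn returns an index in range(q).
            sv[i].append(tpl)
        # Tuples matched by neither function are omitted.
    return sv


def rfn_lrb(e: Event, q: int) -> Optional[int]:
    """Route position reports (type 0) by expressway number."""
    return e["xway"] if e["type"] == 0 else None


def bfn_lrb(e: Event) -> bool:
    """Broadcast events of type 2."""
    return e["type"] == 2
```

Calling splitstream(events, 2, rfn_lrb, bfn_lrb) returns two lists: each holds the type 0 events of its expressway plus every type 2 event.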

Naïve (flat) splitstream implementation: fsplit

fsplit(stream s, integer q, function rfn, function bfn) → vector of stream sv

[Figure: a single fsplit process feeds all CQs.]

• Expensive stream splitting computations
• Bottleneck!

Tree shaped splitstream implementation: maxtree

maxtree(stream s, integer q, function rfn, function bfn) → vector of stream sv

[Figure: a tree of fsplit processes feeds the CQs.]

• Bottleneck is alleviated [Zeitler and Risch, DASFAA 2010]
• but still problematic


Scaled-out splitstream: parasplit

parasplit(stream s, integer q, function rfn, function bfn)

[Figure: PR feeds parallel fsplit processes, whose outputs are merged at each CQ.]

• Window router (PR) distributes entire windows
• Window splitter (fsplit)
• Stream merge (at each CQ)

Parasplit: route – //fsplit – //(merge – CQ)

parasplit(stream s, integer q, function rfn, function bfn)

[Figure: PR feeds parallel fsplit processes, whose outputs are merged at each CQ.]

• Window router (PR) distributes entire windows
• Window splitter (fsplit)
• Stream merge (at each CQ)
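The route, parallel fsplit, and merge stages can be sketched in Python as follows. This is a sequential illustration under stated assumptions, not the SCSQ implementation: windows are counted in tuples rather than bytes, the router assigns windows round-robin, and the merge at each CQ restores order using a window sequence number. None of these three choices is stated on the slides. The parameter names p and w follow the slides' p (number of parallel fsplit) and W (window size).

```python
from typing import Callable, Iterable, List, Optional, Tuple


def fsplit(window, q: int, rfn: Callable, bfn: Callable) -> List[list]:
    """Flat split of one window into q parts (broadcast or route each tuple)."""
    parts: List[list] = [[] for _ in range(q)]
    for tpl in window:
        if bfn(tpl):
            for part in parts:
                part.append(tpl)
        else:
            i: Optional[int] = rfn(tpl, q)
            if i is not None:
                parts[i].append(tpl)
    return parts


def window_router(s: Iterable, w: int, p: int) -> List[List[Tuple[int, list]]]:
    """Cut s into windows of w tuples and hand entire windows to p fsplits.

    Assumption: round-robin assignment; each window keeps a sequence number.
    """
    tuples = list(s)
    routed: List[List[Tuple[int, list]]] = [[] for _ in range(p)]
    for seq, start in enumerate(range(0, len(tuples), w)):
        routed[seq % p].append((seq, tuples[start:start + w]))
    return routed


def parasplit(s: Iterable, q: int, rfn: Callable, bfn: Callable,
              p: int, w: int) -> List[list]:
    """Route windows, split them in p (conceptually parallel) fsplits, merge per CQ."""
    routed = window_router(s, w, p)
    per_cq: List[List[Tuple[int, list]]] = [[] for _ in range(q)]
    for fsplit_windows in routed:  # each iteration stands for one fsplit process
        for seq, window in fsplit_windows:
            for i, part in enumerate(fsplit(window, q, rfn, bfn)):
                per_cq[i].append((seq, part))
    merged: List[list] = []
    for tagged in per_cq:
        # Stream merge at the CQ: restore the original window order.
        tagged.sort(key=lambda item: item[0])
        merged.append([tpl for _, part in tagged for tpl in part])
    return merged
```

Under these assumptions, parasplit returns the same output streams as a single fsplit over the whole input, for any p and w. Only the splitting work is spread over p processes.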

Tree shaped window routing: parasplit*

[Figure: a tree of window routers (PR) distributes windows to the parallel fsplit processes.]


Experimental set-up

Linear Road Benchmark (LRB): www.cs.brandeis.edu/~linearroad

Hardware: Linux cluster
▪ Up to 70 nodes
▪ Each node has 2x quad-core Intel® Xeon® E5430 @ 2.66 GHz, 6 MB L2 cache
▪ TCP/IP over GbE

Performance number L: number of xways (expressways) the DSMS can handle


LRB result

name              org                   year  L (xways)  cores  comment
Aurora            Brandeis, Brown, MIT  2004  2.5        1
Commercial sys A                        2004  0.5        1
SPC               IBM                   2006  2.5        170    3 GHz Xeon
XQuery            ETHZ                  2007  1.5        1
DataCell          CWI                   2009  1          4      1.4 s avg RT
stream schema     ETHZ                  2010  5          4
SCSQ maxtree      UU                    2010  64         48     D disabled (later verified in MySQL)
SCSQ parasplit    UU                    2011  512        560    D disabled


Splitstream stream rate

[Figure: plot over q (x-axis, 0 to 500), y-axis 0 to 1000, series parasplit*, parasplit, maxtree, and fsplit, with the 1 Gbps wire speed marked.]

Window router stream rate

[Figure: PR sends windows of size W to p parallel fsplit processes, which feed the CQs.]

W – physical window size
p – number of parallel fsplit

Impact of window size W in window router

Network bound for large enough windows.

[Figure: plot over W [kB] (x-axis, 0 to 15), y-axis 0 to 1000, series p = 4 and p = 64.]


Impact of window size W in window router when scaling p

[Figure: plot over W [kB] (x-axis, 0 to 15), y-axis 0 to 1000, series p = 4, p = 64, p = 128, p = 256, and p = 512.]


Parasplit*: tree shaped window router

W = 16 kB

[Figure: plot over p (x-axis, 0 to 500), y-axis 0 to 1000, series window router tree (parasplit*) and single window router (parasplit).]

Eliminate p

parasplit(stream s, integer q, function rfn, function bfn)

[Figure: PR distributes windows to p parallel fsplit processes, which feed q CQs.]

▪ Given
  • Input stream rate $\Phi_D$
  • Parallelism q of continuous query
▪ Automatically determine
  • fsplit parallelism p

Cost model for fsplit

[Figure: fsplit consumes the input stream, splits it, and emits the output streams $R_1, \ldots, R_q$.]

$$C_{fsplit} = cr + cs\,(o + r + q\,b) + ce\,(r + q\,b)$$

cr – read cost per tpl (read + de-marshal)
cs – split cost per tpl (execute rfn and bfn)
ce – emit cost per tpl (marshal + print)
o – omit %, r – routing %, b – broadcast %, according to rfn and bfn
q – number of output streams


Cost model for merge in CQ

[Figure: a CQ consumes the input streams $S_1, \ldots, S_p$, merges them, computes, splits, and emits the result streams $R_1, \ldots, R_w$.]

$$C_{CQ} = cr + p \cdot cp + cm + O$$

cr – read cost per tpl (read + de-marshal)
cp – poll cost per tpl
cm – merge cost per tpl
O – cost of executing the CQ and emitting its result
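The two per-tuple cost expressions can be written as small Python functions to make the roles of q and p explicit. This is a sketch, reading the fsplit formula as cr + cs(o + r + q·b) + ce(r + q·b) and the CQ formula as cr + p·cp + cm + O. The percentages o, r, and b are passed as fractions, and the function and parameter names are illustrative only. The helper max_stream_rate assumes that the maximum stream rate of a process is the reciprocal of its per-tuple cost.

```python
def c_fsplit(cr: float, cs: float, ce: float,
             o: float, r: float, b: float, q: int) -> float:
    """Per-tuple cost of fsplit.

    cr, cs, ce: read, split, and emit cost per tuple.
    o, r, b:    fractions of tuples omitted, routed, and broadcast.
    q:          number of output streams.
    """
    return cr + cs * (o + r + q * b) + ce * (r + q * b)


def c_cq(cr: float, cp: float, cm: float, p: int, big_o: float) -> float:
    """Per-tuple cost of a CQ that merges p input streams.

    cp: poll cost, cm: merge cost, big_o: cost of executing the CQ itself (O).
    """
    return cr + p * cp + cm + big_o


def max_stream_rate(cost_per_tuple: float) -> float:
    """Assumption: the sustainable rate is the reciprocal of the per-tuple cost."""
    return 1.0 / cost_per_tuple
```

The fsplit cost grows linearly with q through the broadcast term, and the CQ cost grows linearly with p through the poll term. This is why neither q in a single fsplit nor p in parasplit can be increased without cost.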


Cost model for parasplit

[Figure: PR distributes windows of size W to p parallel fsplit processes, which feed q CQs.]

$$C_{PR} = \frac{cr}{W} + \frac{cs}{W} + \frac{ce}{W}$$

$$C_{fsplit} = \frac{cr}{W} + cs\,(o + r + q\,b) + ce\,(r + q\,b)$$

$$C_{CQ} = cr + p \cdot cp + cm + O$$

p can be eliminated using the cost model, but this requires extensive profiling everywhere.


Heuristic for estimating p

[Figure: PR distributes windows to p parallel fsplit processes, which feed q CQs.]

▪ Assume
  • 1% broadcast tuples (configurable)
  • 0% omitted tuples (configurable)

$$C_{fsplit} = \frac{cr}{W} + cs\,(o + r + q\,b) + ce\,(r + q\,b) \approx (cs + ce)\,(0.99 + 0.01\,q)$$

▪ Measure $\Phi^{(1)}_{fsplit}$ on fsplit with rfn and bfn, q = 1: $cs + ce = 1/\Phi^{(1)}_{fsplit}$
▪ Estimate p by

$$p = \left\lceil \frac{\Phi_D}{\Phi^{(1)}_{fsplit}}\,(0.99 + 0.01\,q) \right\rceil$$
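The heuristic can be tried out with a short Python sketch. It assumes 0% omitted tuples as on the slide, reads the outer brackets of the estimate as a ceiling, and adds a lower bound of one fsplit process. The function name and the broadcast parameter are illustrative. Both rates must be given in the same unit.

```python
import math


def estimate_p(phi_d: float, phi_fsplit_1: float, q: int,
               broadcast: float = 0.01) -> int:
    """Estimate the fsplit parallelism p.

    phi_d:        input stream rate.
    phi_fsplit_1: measured maximum rate of one fsplit with q = 1.
    q:            parallelism of the continuous query.
    broadcast:    fraction of broadcast tuples (1% by default, 0% omitted).
    """
    routed = 1.0 - broadcast
    # Per-tuple cost of fsplit grows by this factor relative to q = 1.
    factor = routed + broadcast * q
    return max(1, math.ceil(phi_d * factor / phi_fsplit_1))
```

For example, with an input rate ten times the measured single-fsplit rate and q = 100, the factor is 0.99 + 0.01·100 = 1.99, so the estimate is 20 fsplit processes.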

p according to heuristics vs. p using exact cost model

[Figure: plot over CQ parallelism q (x-axis, 0 to 500), y-axis 0 to 1000, series parasplit cost model, too high p (p = q), and too low p (p = 1).]


Estimating energy efficiency, η

▪ How much extra energy does parasplit consume in comparison to fsplit?

[Figure: a single fsplit feeding the CQs, compared with parasplit (PR, parallel fsplit, CQs).]

▪ Conservatively assume energy consumption proportional to CPU usage:
▪ Useful work
  • $p \cdot C_{fsplit}$
▪ Overhead
  • $C_{PR}$
  • $q \cdot C_{CQ}$ (O = 0)

$$\eta = \frac{p \cdot C_{fsplit}}{C_{PR} + p \cdot C_{fsplit} + q \cdot C_{CQ}\big|_{O=0}}$$
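A minimal Python sketch of this estimate follows, reading η as useful work divided by useful work plus overhead. The three cost values are taken as given inputs, with the CQ cost evaluated at O = 0 so that only the merge overhead is counted. The function and parameter names are illustrative.

```python
def energy_efficiency(p: int, q: int, c_pr: float,
                      c_fsplit: float, c_cq_merge: float) -> float:
    """Estimate eta for parasplit.

    c_pr:       per-tuple cost of the window router.
    c_fsplit:   per-tuple cost of one fsplit process.
    c_cq_merge: per-tuple cost of a CQ with O = 0 (merge overhead only).
    """
    useful = p * c_fsplit
    overhead = c_pr + q * c_cq_merge
    return useful / (useful + overhead)
```

With p = 4, q = 8, a router cost of 2.0, an fsplit cost of 1.0, and a merge cost of 0.25, useful work and overhead are both 4.0, which gives η = 0.5.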


Measuring energy efficiency

[Figure: plot over CQ parallelism q (x-axis, 0 to 500), y-axis 0% to 100%, series parasplit*, parasplit cost model, too high p (p = q), and too low p (p = 1).]

Related work

▪ Nobody else has investigated strategies for scalable customizable stream splitting
▪ IBM SPADE/System S [Andrade et al 2009]
  • Splitstream operator with broadcast capabilities
  • Streaming throughput degrades when scaling q
▪ Event based systems [Brenna et al 2009]
  • Custom stream splitting shown to be a bottleneck
▪ Gigascope [Johnson et al 2008]
  • Assumes specialized stream splitting hardware
  • No customizable stream splitting
▪ GSDM [Ivanova, Risch 2005]
  • Parallel execution of expensive UDFs
  • More limited parallelization
▪ Streaming MapReduce [Condie et al 2010]
  • Does not handle scalable stream splitting
▪ [Balkesen, Tatbul 2011]
  • Distributing entire windows over CQs
  • q ≤ 4

Conclusions and future work

Conclusions
▪ Naïve stream splitting is prohibitive for scale-out of CQs
▪ Parasplit
  • eliminates the bottleneck of stream splitting, providing network bound stream rates
▪ Parasplit*
  • provides network bound stream rates for highly scaled-out stream splitting

Future work
▪ Push selection predicates from CQ to rfn of splitstream
▪ Improve energy efficiency
▪ High availability

▪ SCSQ home page
  • http://www.it.uu.se/research/group/udbl/SCSQ.html



Extra material

▪ Window router tree
▪ Cost model
▪ LRB
  • Parallelization of LRB
▪ Additional related work


Single process window router, p = 64

Tree shaped window router, p = 64

Parasplit + tree shaped window router = parasplit*


Heuristics for estimating p

[Figure: input stream with rate $\Phi_D$ enters PR, which distributes windows to p parallel fsplit processes feeding q CQs.]

▪ Given
  • Input stream rate $\Phi_D$
  • Parallelism q of continuous query
▪ Determine fsplit parallelism p
  • If the max stream rate of fsplit is $\Phi_{fsplit}$, choose p such that $p \cdot \Phi_{fsplit} \geq \Phi_D$
  • $C_{CQ} = cr + p \cdot cp + cm + O$ increases with p
▪ Must choose p carefully

Linear Road Benchmark

▪ Simulates vehicles travelling (and colliding)
  • on a number of expressways
  • using variable tolling
  • based on traffic conditions and accident proximity
▪ Input: One stream of position reports and historical queries (account balance, daily tolls)
▪ Continuous queries: Toll notifications, accident notifications
▪ Output: Four result streams of responses to historical and continuous queries:
  0. toll alerts
  1. accident alerts
  2. account balance responses
  3. daily expenditure responses
▪ L-rating: Number of xways processed within response time (RT) constraints


Parallelization of LRB using fsplit

[Figure: fsplit processes feed parallel CQs, scaled up with q. Their outputs are combined by union and groupby into toll alerts, accident alerts, and account balance answers.]

Daily expenditure queries D are excluded here.

Daily expenditure data is managed by a regular DBMS.

Other related work

▪ Medusa [Balazinska et al 2004]
  • Parallel DSMS
  • Dynamic migration of operators between nodes
  • Without scale-out, heavy operators are bottlenecks
▪ Dryad [Isard et al 2007]
  • User defined process graphs in QL (edges + vertices)
  • SCSQ automatically generates such graphs from splitstream
▪ SCOPE [Chaiken et al 2008], Map-reduce-merge [Yang et al 2007]
  • All these are batch systems, not DSMSs
▪ Distributed DBMS
  • For streams, rfn and bfn are analogous to fragmentation and replication conditions in a DDBMS
  • DDBMSs do not scale out fragmentation and replication, while splitstream parallelizes rfn and bfn

