Lecture 4

ITCS 6163
Cube Computation
Two Problems
• Which cuboids should be materialized?
– Ullman et al., SIGMOD '96 paper
• How to efficiently compute cube?
– Agrawal et al., VLDB '96 paper
Implementing Data Cubes Efficiently
DSS queries must be answered fast!
1 min OK…
seconds great!
> 1 min NOT ACCEPTABLE!
One solution: materialize frequently asked queries (or supersets of them).
Picking the right set is difficult (O(2^n) possibilities).
What to materialize
• Nothing: pure ROLAP
• Everything: too costly
• Only a part. Key idea: many cells are computable from other cells.
(Lattice: ABC → AB, AC, BC; each two-attribute cuboid is computable from ABC.)
Dependent and independent cells
Example: TPC-D benchmark (supplier, part, customer, sales)
Q (find all sales for each part) is dependent on
Q' (find all sales for each part and each supplier):
(p, all, all) ≼ (p, s, all)   (dependency)
(p, all, all) = Σ_{i,j} (p, s_i, c_j)
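For instance, a minimal sketch of this roll-up in Python (the cell values are made up for illustration; dictionary keys play the role of (p, s, c) cells):

    from collections import defaultdict

    # materialized finest-level cells (p, s, c) -> sales (illustrative numbers)
    psc = {('p1', 's1', 'c1'): 10, ('p1', 's2', 'c1'): 5, ('p2', 's1', 'c2'): 7}

    # (p, all, all) = sum over s_i, c_j of (p, s_i, c_j)
    p_all_all = defaultdict(int)
    for (p, s, c), sales in psc.items():
        p_all_all[p] += sales
    print(dict(p_all_all))   # {'p1': 15, 'p2': 7}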
Example
Cuboid sizes (cube lattice: PSC → PC, PS, SC; PC, PS, SC → P, S, C; P, S, C → ALL):
1. PSC: 6 M cells
2. PC: 6 M cells
3. PS: 0.8 M cells
4. SC: 6 M cells
5. P: 0.2 M cells
6. S: 0.01 M cells
7. C: 0.1 M cells
8. ALL: 1 cell
Example (2)
We want to answer (p, all, all) (sales grouped by part).
a) If we have view 5 (P) materialized, we just read off the answer.
b) If we have only view 2 (PC), we must aggregate it: 6 million cells. Do they fit in RAM?
Cost of a) ≈ 0.2 M
Cost of b) ≈ 6 M
Decisions, decisions...
How many views must we materialize to get good performance?
Given space S (on disk), which views do we materialize?
In the previous example we’d need space for 19 Million cells.
Can we do better?
Avoid going to the raw (fact-table) data: materialize PSC (6 M).
PC (6 M) can be answered using PSC (6 M): no advantage.
SC (6 M) can be answered using PSC (6 M): no advantage.
Example again
Answer PC and SC from PSC instead of materializing them (about the same performance, since both are 6 M cells either way):
1. PSC: 6 M (materialized)
2. PC: answered from PSC (6 M)
3. PS: 0.8 M (materialized)
4. SC: answered from PSC (6 M)
5. P: 0.2 M (materialized)
6. S: 0.01 M (materialized)
7. C: 0.1 M (materialized)
Total: 6 + 0.8 + 0.2 + 0.01 + 0.1 = 7.11 M cells vs. 19 M.
Formal treatment
Q1 ≼ Q2   (dependency)
Q(P) ≼ Q(PC) ≼ Q(PSC)   (lattice)
Add hierarchies
Customers: C (customers) → N (nation-wide cust., e.g., USA, Japan) → DF (domestic/foreign) → ALL (all cust.)
Suppliers: S (suppliers) → SN (nation-wide) → ALL
Parts: P (parts) → Sz (size) and Ty (type) → ALL
Formal treatment (2)
(Lattice combining the part and customer hierarchies, with view sizes:)
CP (6 M), CTy (5.99 M), C Sz (5 M), NP (5 M), C (0.1 M), N Ty (3,750), N Sz (1,250), P (0.2 M), Ty (150), Sz (50), N (25), ALL (1)
Formal Treatment (3)
Cost model:
Cost(Q) = number of cells in the materialized view Qa used to answer Q (Q ≼ Qa).
With indexes we can make this better!
How do we know the size of the views? Sampling.
Optimizing Data Cube lattices
First problem (no space restrictions).
A VERY HARD problem (NP-complete).
Heuristics:
Always include the "core" cuboid.
At every step you have a set Sv of materialized views.
Compute the benefit of view v relative to Sv as follows:
    For each w ≼ v define Bw:
        Let u be the view of least cost in Sv such that w ≼ u.
        If Cost(v) < Cost(u), then Bw = Cost(v) - Cost(u) (negative);
        else Bw = 0.
    Define B(v, Sv) = - Σ_{w ≼ v} Bw
Greedy algorithm
Sv = {core view}
for i = 1 to k begin
    select v not in Sv such that B(v, Sv) is maximum;
    Sv = Sv ∪ {v}
end
A simple example
(Lattice of eight views a–h, with a the top view; from the figure: b and c are children of a; d and e are children of b; e and f are children of c; g sits below d and e; h sits below e and f.)
View costs: C(a)=100, C(b)=50, C(c)=75, C(d)=20, C(e)=30, C(f)=40, C(g)=1, C(h)=10

Round 1, Sv = {a}:
B(b,Sv)=250, B(c,Sv)=125, B(d,Sv)=160, B(e,Sv)=210, B(f,Sv)=120, B(g,Sv)=99, B(h,Sv)=90
→ pick b; Sv = {a, b}

Round 2, Sv = {a, b}:
B(c,Sv)=50, B(d,Sv)=60, B(e,Sv)=60, B(f,Sv)=70, B(g,Sv)=49, B(h,Sv)=40
→ pick f; Sv = {a, b, f}

Round 3, Sv = {a, b, f}:
B(c,Sv)=25, B(d,Sv)=60, B(e,Sv)=50, B(g,Sv)=49, B(h,Sv)=30
→ pick d; Sv = {a, b, d, f}
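As a sanity check, here is a minimal Python sketch of the greedy algorithm run on this example (lattice edges and costs transcribed from the figure; benefit follows the definition above, with the sign flipped so larger is better):

    costs = {'a': 100, 'b': 50, 'c': 75, 'd': 20,
             'e': 30, 'f': 40, 'g': 1, 'h': 10}

    # children[v]: views directly computable from v in the lattice
    children = {'a': ['b', 'c'], 'b': ['d', 'e'], 'c': ['e', 'f'],
                'd': ['g'], 'e': ['g', 'h'], 'f': ['h'], 'g': [], 'h': []}

    def below(v):
        """All views w with w <= v (v itself included)."""
        seen, stack = set(), [v]
        while stack:
            u = stack.pop()
            if u not in seen:
                seen.add(u)
                stack.extend(children[u])
        return seen

    def benefit(v, Sv):
        """B(v, Sv): total cost reduction over all w <= v."""
        total = 0
        for w in below(v):
            # cheapest materialized view that can answer w
            u_cost = min(costs[u] for u in Sv if w in below(u))
            if costs[v] < u_cost:
                total += u_cost - costs[v]
        return total

    Sv = {'a'}                        # always include the core view
    for _ in range(3):                # pick k = 3 additional views
        v = max((x for x in costs if x not in Sv), key=lambda x: benefit(x, Sv))
        print(v, benefit(v, Sv))      # prints: b 250, f 70, d 60
        Sv.add(v)                     # final Sv = {a, b, d, f}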
MOLAP example
Hyperion's Essbase: see www.hyperion.com to download a white paper and product demo.
• Builds a special secondary-memory data structure to store the
cells of the core cuboid.
• Assumes that data is sparse and clustered along some
dimension combinations
• Chooses dense dimension combinations.
• The rest are sparse combinations.
Structures
Two levels:
• Blocks in the first level correspond to the dense dimension combinations. The basic block has size proportional to the product of the cardinalities of these dimensions. Each entry in the block points to a second-level block.
• Blocks in the second level correspond to the sparse dimensions. They are arrays of pointers, as many as the product of the cardinalities of the sparse dimensions. Each pointer has one of three values: null (non-existent data), impossible (non-allowed combination), or a pointer to an actual data block.
Data Example
Dimensions: Departments (Sales, Mkt), Time, Geographical information, Product, Distribution channels.
Departments will generally have data for each Time period, so these two form the dense dimension combination.
Geographical information, Product and Distribution channels, on the other hand, are typically sparse (e.g., most cities have only one Distribution channel and some of the Product values).
Structures revisited
(Figure: the upper-level array has one entry per dense combination: (S,1Q), (S,2Q), (S,3Q), (S,4Q), (M,1Q), (M,2Q), (M,3Q), (M,4Q). Each entry points through a sparse (Geo., Product, Dist.) pointer array to a data block.)
Allocating memory
Define the member structure (e.g., dimensions).
Select the dense dimension combinations and create the upper-level structure.
Create the lower-level structure.
Input a data cell: if the pointer to the data block is empty, create a new block; else insert the data into the existing data block.
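A minimal sketch of these two levels in Python (illustrative only, not Essbase's actual layout; the "impossible" pointer value is omitted for brevity):

    DEPTS, QUARTERS = ['Sales', 'Mkt'], ['1Q', '2Q', '3Q', '4Q']

    # upper level: one entry per dense combination (Department x Quarter)
    upper = {(d, q): {} for d in DEPTS for q in QUARTERS}

    def input_cell(dept, quarter, geo, product, channel, value):
        block = upper[(dept, quarter)]   # dense lookup, slot always exists
        key = (geo, product, channel)    # sparse combination
        if key not in block:             # pointer empty: create a new data block
            block[key] = []
        block[key].append(value)         # else: insert data in the data block

    input_cell('Sales', '1Q', 'Charlotte', 'p1', 'retail', 100.0)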
Problem 2: COMPUTING DATACUBES
Four algorithms
• PIPESORT
• PIPEHASH
• SORT-OVERLAP
• Partitioned-cube
Optimizations
• Smallest-parent
– AB can be computed from ABC, ABD, or ABCD.
Which one should we use?
• Cache-results
– Having computed ABC, we compute AB from it while
ABC is still in memory
• Amortize-scans
– We may try to compute ABC, ACD, ABD, BCD in one
scan of ABCD
• Share-sorts
• Share-partitions
PIPESORT
Input: cube lattice and cost matrix. Each edge e_ij in the lattice is annotated with two costs:
S(i,j): cost of computing j from i when i is not sorted
A(i,j): cost of computing j from i when i is sorted
Output: a subgraph of the lattice where each cuboid (group-by) is connected to a single parent from which it will be computed, and is associated with an attribute order in which it will be sorted. If the order is a prefix of the order of its parent, then the child can be computed without sorting the parent (cost A); otherwise the parent has to be sorted first (cost S). For every parent there will be only one out-edge labeled A.
PIPESORT (2)
Algorithm: proceeds in levels, k = 0, …, N-1 (N = number of dimensions). For each level, it finds the best way of computing level k from level k+1 by reducing the problem to a weighted bipartite matching problem.
Make k additional copies of each level-(k+1) group-by (each node then has k+1 vertices) and connect them to the same children as the original.
From the original copy, the edges have A costs, while the edges from the copies have S costs.
Find the minimum-cost matching in the bipartite graph (so that each group-by in level k is matched to a single parent vertex in level k+1).
Example
(Figure: level-1 cuboids A, B, C and level-2 cuboids AB, AC, BC, with (A, S) cost pairs AB (2, 10), AC (5, 12), BC (13, 20).)

Transformed lattice
(Figure: each level-2 cuboid appears twice, the original carrying its A cost and the copy its S cost: AB(2), AB(10), AC(5), AC(12), BC(13), BC(20), each connected to the level-1 cuboids A, B, C.)
Explanation of edges
AB(2) → A: this means we have AB (no need to sort to produce A).
AB(10) → A: this means that we really have BA (we need to sort it to get A).
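This prefix rule is easy to state in code; a small sketch (function name and costs are illustrative):

    def edge_cost(parent_order, child_order, A, S):
        """Cost of computing the child from a parent stored in parent_order:
        no sort is needed iff child_order is a prefix of parent_order."""
        return A if parent_order.startswith(child_order) else S

    print(edge_cost('AB', 'A', 2, 10))   # 2: A is a prefix of AB
    print(edge_cost('BA', 'A', 2, 10))   # 10: BA must be re-sorted to produce A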
PIPESORT pseudo-algorithm
Pipesort (input: lattice with A() and S() edge costs):
for level k = 0 to N-1
    Generate_plan(k+1);
    for each cuboid g in level k+1
        fix the sort order of g as the order of the cuboid connected to g by an A edge;

Generate_plan
Generate_plan(k+1):
    make k additional copies of each level-(k+1) cuboid;
    connect each copy to the same set of vertices as the original;
    assign costs A to the original edges and S to the copies;
    find the minimum-cost matching on the transformed graph;
Example
(Figure: the resulting plan, each group-by annotated with the order in which it is computed; alternate sort orders from the original figure are shown in parentheses:)
Level 0: ALL
Level 1: A, B, C, D
Level 2: BA, AC, AD, CB, DB, CD
Level 3: CBA, BAD (DBA), ACD (ADC) (CDA), DBC
Level 4: ABCD, computed as CBAD (BADC) (ACDB) (DBCA)
PipeHash
Input: lattice
PipeHash chooses for each vertex the parent with the
smallest estimated size. The outcome is a minimum spanning
tree (MST), where each vertex is a cuboid and an edge from i to
j shows that i is the smallest parent of j.
Available memory is usually not enough to compute all the cuboids in the MST together, so we need to decide which cuboids can be computed together (sub-MSTs), when to allocate and release memory for the different hash tables, and what attribute to use for partitioning the data.
PipeHash
Input: lattice and estimated sizes of cuboids
Initialize the worklist with the MST of the search lattice;
while the worklist is not empty
    pick a tree T from the worklist;
    T' = Select-subtree(T), the subtree to be executed next;
    Compute-subtree(T');
Select-subtree
Select-subtree(T):
    if the memory required by T is less than what is available, return T;
    else, let S be the set of attributes in root(T).
        For any s ∈ S we get a subtree Ts of T, also rooted at root(T), including all cuboids that contain s.
        Let Ps = the maximum number of partitions of root(T) possible if partitioned on s.
        Choose s such that mem(Ts)/Ps < memory available and Ts is the largest over all choices of s ∈ S.
        Remove Ts from T (put T - Ts back in the worklist).
Compute-subtree
Compute-subtree(T'):
    numP = mem(T') * f / mem-available
    partition the root of T' into numP partitions;
    for each partition of root(T')
        for each node n in T'
            compute all children of n in one scan;
            if n is cached, save it to disk and release the memory occupied by its hash table;
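A hedged sketch of the inner step, "compute all children of n in one scan", using one hash table (a dict) per child group-by; the attribute positions and SUM aggregation are illustrative assumptions:

    from collections import defaultdict

    def compute_children(parent_cells, child_attr_positions):
        """parent_cells: {attribute tuple: measure};
        child_attr_positions: for each child, which parent attributes it keeps."""
        tables = [defaultdict(int) for _ in child_attr_positions]
        for key, measure in parent_cells.items():          # one scan of the parent
            for table, positions in zip(tables, child_attr_positions):
                table[tuple(key[i] for i in positions)] += measure
        return tables

    abc = {('a1','b1','c1'): 3, ('a1','b1','c2'): 4, ('a2','b1','c1'): 5}
    ab, ac, bc = compute_children(abc, [(0, 1), (0, 2), (1, 2)])
    print(dict(ab))   # {('a1', 'b1'): 7, ('a2', 'b1'): 5}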
Example
(Figure: the MST is partitioned on attribute A; the sub-MST containing A, AB, AC, AD, ABC, ABD, ACD and ABCD is computed first, one partition of the root ABCD at a time, and completed hash tables are written to disk.)
Remaining Subtrees
(Figure: the remaining subtrees of the MST, over ALL, A, B, C, D, AB, BC, BD, CD, ABC, BCD, ABCD.)
OVERLAP
Sorted runs:
Consider a cuboid on j attributes {A1, A2, …, Aj}; we use B = (A1, A2, …, Aj) to denote the cuboid sorted in that attribute order. Consider S = (A1, …, A_{l-1}, A_{l+1}, …, Aj), computed from B by dropping attribute A_l. A sorted run R of S in B is defined as R = π_S(Q), where Q is a maximal sequence of tuples of B such that, for each tuple in Q, the first l columns have the same value.
Sorted-run
B = [(a,1,2),(a,1,3),(a,2,2),(b,1,3),(b,3,2),(c,3,1)]
S = first and third attribute
S = [(a,2),(a,3),(b,3),(b,2),(c,1)]
Sorted runs: [(a,2),(a,3)] [(a,2)] [(b,3)] [(b,2)] [(c,1)]
Partitions
B and S have a common prefix (A1, …, A_{l-1}).
A partition of the cuboid S in B is the union of sorted runs such that the first l-1 columns of all the tuples of the sorted runs have the same values.
[(a,2),(a,3)] [(b,2),(b,3)] [(c,1)]
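These definitions can be checked mechanically; a small Python sketch reproducing the example above (attribute positions hard-coded for this B, with l = 2 the dropped position):

    from itertools import groupby

    B = [('a',1,2), ('a',1,3), ('a',2,2), ('b',1,3), ('b',3,2), ('c',3,1)]
    l = 2   # S keeps the first and third attributes

    # sorted runs: maximal groups agreeing on the first l columns, projected onto S
    runs = [[(t[0], t[2]) for t in g] for _, g in groupby(B, key=lambda t: t[:l])]
    print(runs)    # [[('a',2),('a',3)], [('a',2)], [('b',3)], [('b',2)], [('c',1)]]

    # partitions: sorted runs merged over a common first l-1 columns
    parts = [sorted({(t[0], t[2]) for t in g})
             for _, g in groupby(B, key=lambda t: t[:l-1])]
    print(parts)   # [[('a',2),('a',3)], [('b',2),('b',3)], [('c',1)]]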
OVERLAP
Sort the base cuboid: this forces the sorted order in which the
other cuboids are computed
(Figure: the cuboid tree rooted at the sorted base cuboid ABCD, covering ABC, ABD, ACD, BCD, AB, AC, AD, BC, BD, CD, A, B, C, D and ALL; each cuboid inherits its sort order from its parent.)
OVERLAP (2)
If there is enough memory to hold all the cuboids, compute them all (very seldom true). Otherwise, use the partition as the unit of computation: we just need sufficient memory to hold one partition. As soon as a partition is computed, its tuples can be pipelined to compute descendant cuboids (same partition) and then written to disk. The memory is then reused for the next partition.
Example: XYZ → XZ
XYZ = [(a,1,2),(a,1,3),(a,2,2),(b,1,3),(b,3,2),(c,3,1)]
Partitions of XZ in XYZ: [(a,2),(a,3)] [(b,2),(b,3)] [(c,1)]
Compute the cells (a,1,2),(a,1,3),(a,2,2) in XYZ; use them to compute (a,2),(a,3) in XZ, then write these cells to disk.
Compute the cells (b,1,3),(b,3,2) in XYZ; use them to compute (b,2),(b,3) in XZ, then write these cells to disk.
Compute the cell (c,3,1) in XYZ; use it to compute (c,1) in XZ, then write all these cells to disk.
OVERLAP(3)
Choose a parent to compute a cuboid: DAG.
Goal: minimize the size of the partitions of a cuboid, so less memory is
needed. E.g., it is better to compute AC from ACD than from ABC, (since the
sort order matches and the partition size is 1). This is a hard problem.
Heuristic: maximize the size of the common prefix.
(Figure: the same lattice, now with each cuboid attached to the parent that maximizes the common prefix.)
OVERLAP (4)
Choose a set of cuboids for overlapped computation, subject to the memory constraints. To compute a cuboid in memory, we need memory equal to the size of its partition. Partition sizes can be estimated from cuboid sizes by assuming some distribution (uniform?). If this much memory can be spared, the cuboid is marked as being in Partition state. Other cuboids are allocated a single page (for temporary results); these cuboids are in SortRun state. A cuboid in Partition state can have its tuples pipelined for the computation of its descendants.
A cuboid can be considered for computation if it is the root, or if its parent is marked as being in Partition state.
The total memory allocated to all cuboids cannot exceed the available memory.
OVERLAP (5)
Again, a hard problem… Heuristic: traverse the tree in BFS manner.
(Figure: the tree annotated with partition sizes: A(1), B(1), C(1), D(5), AB(1), AC(1), BC(1), AD(5), CD(40), BD(1), ABC(1), ABD(1), ACD(1), BCD(50), under the root ABCD.)
OVERLAP (6)
Computing a cuboid from its parent B:
Output: the sorted cuboid S
foreach tuple t of B do
    if (state == Partition) then process_partition(t);
    else process_sorted_run(t);
OVERLAP (7)
Process_partition:
    If the input tuple starts a new partition, output the current partition at the end of the cuboid and start a new one.
    If the input tuple matches an existing tuple in the partition, update the aggregate.
    Else, insert the input tuple into the partition as a new aggregate.
Process_sorted_run:
    If the input tuple starts a new sorted run, flush all the pages of the current sorted run and start a new one.
    If the input tuple matches the last tuple in the sorted run, recompute the aggregate.
    Else, append the tuple to the end of the existing run.
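A compact sketch of the Partition-state path (process_partition), computing XZ from XYZ one in-memory partition at a time; COUNT stands in for the aggregate, since the example tuples carry no explicit measure:

    def compute_partitioned(B, project, prefix):
        out, part, cur = [], {}, None
        for t in B:
            if prefix(t) != cur:                 # tuple starts a new partition:
                out.extend(sorted(part.items())) # output it at the end of the cuboid
                part, cur = {}, prefix(t)
            k = project(t)
            part[k] = part.get(k, 0) + 1         # matching tuple: update; else insert
        out.extend(sorted(part.items()))         # flush the last partition
        return out

    xyz = [('a',1,2), ('a',1,3), ('a',2,2), ('b',1,3), ('b',3,2), ('c',3,1)]
    xz = compute_partitioned(xyz, project=lambda t: (t[0], t[2]),
                             prefix=lambda t: t[0])
    # xz = [(('a',2), 2), (('a',3), 1), (('b',2), 1), (('b',3), 1), (('c',1), 1)]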
Observations
In ABCD → ABC, the partition size is 1. Why?
In ABCD → ABD, the partition size is equal to the number of distinct C values. Why?
In ABCD → BCD, the partition size is the size of the cuboid BCD. Why?
Running example with 25 pages
(Figure: the same tree; with 25 pages of memory, the cuboids with small partitions, e.g. ALL(1), A(1), B(1), C(1), D(5), AB(1), AC(1), BC(1), AD(5), BD(1), ABC(1), ABD(1), ACD(1), stay in Partition state, while the large ones, CD and BCD, get 1 page each in SortRun state.)
Other issues
• Iceberg cube
– contains only aggregates above a certain threshold
– Jiawei Han's SIGMOD '01 paper
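For instance, a minimal sketch of an iceberg group-by (COUNT with a minimum-support threshold; function name and data are illustrative):

    from collections import Counter

    def iceberg_groupby(rows, dims, min_count):
        """Keep only the cells whose COUNT reaches the threshold."""
        counts = Counter(tuple(r[d] for d in dims) for r in rows)
        return {cell: c for cell, c in counts.items() if c >= min_count}

    rows = [{'part': 'p1', 'supp': 's1'}, {'part': 'p1', 'supp': 's2'},
            {'part': 'p2', 'supp': 's1'}]
    print(iceberg_groupby(rows, ['part'], min_count=2))   # {('p1',): 2}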