Vectorized XML?

Peter Buneman, University of Edinburgh

VO meeting, 1 July 2003

How RDBMSs store tables

The textbook account is that rows are stored contiguously:

    Name    Age   Shoesize
    Joe     11    8
    Fred    56    11
    Jane    35    8
    Sally   25    8
    Roddy   44    11
    Lori    42    7

[Figure: the rows packed one after another onto disk pages – one page holding Lori 42 7, Jane 35 8, Joe 11 8, Sally 25 8, another holding Roddy 44 11, Fred 56 11.]


An alternative is to store the columns contiguously:

    Name    Age   Shoesize
    Joe     11    8
    Fred    56    11
    Jane    35    8
    Sally   25    8
    Roddy   44    11
    Lori    42    7

[Figure: the same table on disk pages, but column by column – the Name vector (Joe, Fred, Jane, Sally, Roddy, Lori), then the Age vector (11, 56, 35, 25, 44, 42), then the Shoesize vector (8, 11, 8, 8, 11, 7).]

Batory ’79; Copeland and Khoshafian ’85; Boncz, Wilschut and Kersten ’98; Ailamaki, DeWitt, Hill and Skounakis ’01.


Why do this?

• Projections come for “free”. E.g.

      SELECT SingleField
      FROM FiftyColumnTable

  requires 2% of the i/o.

• The received wisdom on joins is that they take 3–5 scans through the table. We need only scan the join and “output” fields.

• The storage per column is smaller. Conventional techniques use “sparse” placement on pages, ≈ 3 × the storage used for a textual dump.

• Much better chance of getting entire columns into main memory.
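The contrast can be sketched in a few lines of Python (an illustrative toy, not any particular system; the table data is the slide’s example):

```python
# Illustrative sketch only: the slide's table held as a row store and as a
# column store, showing why a single-column projection is nearly free.

rows = [
    ("Joe", 11, 8), ("Fred", 56, 11), ("Jane", 35, 8),
    ("Sally", 25, 8), ("Roddy", 44, 11), ("Lori", 42, 7),
]

# Column store: one contiguous vector per attribute.
columns = {
    "Name":     [r[0] for r in rows],
    "Age":      [r[1] for r in rows],
    "Shoesize": [r[2] for r in rows],
}

def project(store, field):
    """SELECT field FROM table: touches exactly one vector, so on a
    fifty-column table it needs ~2% of the i/o of a row scan."""
    return store[field]

print(project(columns, "Age"))   # [11, 56, 35, 25, 44, 42]
```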


So why don’t we always vectorize?

• Bad on highly selective queries:

      SELECT *
      FROM FiftyColumnTable
      WHERE KeyValue = 123456

  requires 50 page accesses rather than one (in addition to the indexing).

• Deletions are a disaster!!

• Other transactions may be more expensive, e.g. tuple locking.


But the idea is being used

• Vertical partitioning (decomposing into “narrower” tables) is commonplace.

• Vectorized implementations have been developed:

  – Sybase IQ (?)

  – Three companies on Wall Street. Two are “shipping” products.


Other advantages and tricks

• Good for combining with numerical/array processing. (Order of rows may be significant.)

• Easy to add/drop columns (schema evolution)

• Keys/foreign keys are “compiled” into indexes.

• Wall Street systems provide both SQL and APL interfaces to data.

– Excellent research topic: combine array processing optimization and database query optimization.


The Connection with XML

Note that XML does not support updates and transactions (at least we don’t know how to define them.)

Many scientific databases are also low transaction and write-mostly.

At first sight XML is irregular, and it’s not clear how to “vectorize” it. One possibility is to use the DTD to find table-like fragments.

Another is to use the Liefke-Suciu decomposition – originally designed for compressing XML...


The Liefke-Suciu Decomposition

    <recipe-book>
      <recipe>
        <contributor>Annie</contributor>
        <name>Salsa</name>
        <comment>Peter <i>loves</i> this stuff</comment>
        <ingredient><name>tomatoes</name><qty>1kg</qty></ingredient>
        <ingredient><name>onions</name><qty>200g</qty></ingredient>
      </recipe>
    </recipe-book>

[Figure: the corresponding document tree – recipe-book → recipe → contributor, name, comment (with an inline i element), and two ingredient elements, each with a name and a qty.]


The files

[Figure: the decomposition of the recipe example into
(a) the tag map – tags are numbered: recipe-book 1, recipe 2, name 3, contributor 4, comment 5, i 6, ingredient 7, qty 8;
(b) the data files – one file per tag path: 1/2/3 holds Salsa; 1/2/4 holds Annie; 1/2/5 holds Peter and this stuff; 1/2/5/6 holds loves; 1/2/7/3 holds tomatoes and onions; 1/2/7/8 holds 1kg and 200g;
(c) the skeleton – the tree of tag numbers with every text node replaced by a “#” placeholder.]
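The flavour of the decomposition can be sketched as follows – a simplified toy in which text values are vectorized per element name rather than per numbered tag path, and the function name decompose is my own:

```python
import xml.etree.ElementTree as ET

DOC = """<recipe-book><recipe>
  <contributor>Annie</contributor><name>Salsa</name>
  <comment>Peter <i>loves</i> this stuff</comment>
  <ingredient><name>tomatoes</name><qty>1kg</qty></ingredient>
  <ingredient><name>onions</name><qty>200g</qty></ingredient>
</recipe></recipe-book>"""

def decompose(elem, skeleton, data):
    # The skeleton keeps only tags and '#' placeholders; each piece of
    # text is appended to the data vector of its enclosing tag.
    skeleton.append(elem.tag)
    if elem.text and elem.text.strip():
        data.setdefault(elem.tag, []).append(elem.text.strip())
        skeleton.append("#")
    for child in elem:
        decompose(child, skeleton, data)
        if child.tail and child.tail.strip():
            data.setdefault(elem.tag, []).append(child.tail.strip())
            skeleton.append("#")
    skeleton.append("/" + elem.tag)

skeleton, data = [], {}
decompose(ET.fromstring(DOC), skeleton, data)
print(data["name"])   # ['Salsa', 'tomatoes', 'onions']
print(data["qty"])    # ['1kg', '200g']
```

Note that the real decomposition keys each data file by the full root-to-leaf tag path, so the recipe name and the ingredient names would land in different files.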


The LS decomposition was designed for compression. It showed that better compression is usually obtained by compressing the “columns” individually.

Claim: it is also useful for querying.

Prior claim: it efficiently supports the SAX API.

• The data is already parsed; checks for well-formedness have been performed. The “identity” program runs 5 times faster on LS.

• One can build a lazy SAX parser: files are not read until needed.

Simple “select/project” queries on downwards paths can be implemented with lazy SAX:

    FOR $X IN DB/P
    WHERE $X/A = "blahblah"
    RETURN $X/B


All sizes in KB:

                     Source    Lazy   Skeleton
    Uncompressed
      Baseball          672     106         85
      Shakespeare      7646     561        539
    Compressed
      Baseball           66      11        1.3
      Shakespeare      2139      40         31

Baseball query: names of all players with ERA > 0.3 (or something).

Shakespeare query: titles of all plays in which Falstaff appears.

NB. Compression of the baseball skeleton shows much more regularity than is indicated by the DTD.


Compressing the Skeleton

(with Martin Grohe and Christoph Koch)

• The skeleton can be quite large.

• We’d like it to fit into main memory.

• We’d like to compress it in a “query-friendly” fashion.

Idea: recognize common “subexpressions”. Essentially the same idea as is used in symbolic model checking for ordered binary decision diagrams.
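The sharing idea can be sketched by hash-consing subtrees (a toy illustration, not the actual minimization algorithm; the compress function and the bib example below are my own):

```python
# Toy illustration of the shared-subexpression idea: hash-consing collapses
# structurally identical subtrees, turning the skeleton tree into a DAG.

def compress(tree, pool):
    """tree = (label, [children]); pool maps (label, child ids) -> node id.
    Returns the DAG node id representing this subtree."""
    label, children = tree
    key = (label,) + tuple(compress(c, pool) for c in children)
    if key not in pool:
        pool[key] = len(pool)      # intern a fresh shared node
    return pool[key]

title, author = ("title", []), ("author", [])
book  = ("book",  [title, author])
paper = ("paper", [title, author, author])
bib   = ("bib",   [book, paper, paper])     # 12 nodes as a tree

pool = {}
compress(bib, pool)
print(len(pool))   # 5 shared nodes: title, author, book, paper, bib
```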


Shared subexpressions

[Figure: (a) a bib tree with a book (title, author) and two identical papers (title, author, author); (b), (c) progressively sharing the identical subtrees, ending in a DAG whose edges carry multiplicities, e.g. bib → paper (×2).]

Notes

• In this example we have left out the “text here” stubs. In most DB examples text is always present or always absent, so the compression is no worse if they are added.

• The maximum compression (with edge cardinalities) is log(log n).

• In practice, relational databases under the vanilla XML encoding compress to a small constant size (independent of the number of tuples).


Equivalence of Skeletons

We represent a tree as a set of vertices V and a function γ : V → V* that gives the ordered sequence of children.

We require the graph of γ to be acyclic and to have a single root.

Write v →ᵢ w if w is the i-th child of v.

A schema is a set σ of unary relation names.

An instance I of σ consists of a tree (V, γ) together with a function that associates a subset Sₙ of V with each name n ∈ σ.


Equivalent Instances

[Figure: two instances of the schema {S_bib, S_book, S_title, S_paper, S_author} – (a) over vertices v1…v5 and (b) over vertices w1…w7 – which are equivalent even though (b) duplicates some subtrees. Edge numbers here indicate child order.]

Equivalence is Bisimulation

A bisimilarity relation on a σ-instance I is an equivalence relation ∼ on V such that for all v, w ∈ V with v ∼ w we have

• for all i, if v →ᵢ v′ then there exists w′ ∈ V such that w →ᵢ w′ and v′ ∼ w′, and

• for all n ∈ σ: v ∈ Sₙ ⟺ w ∈ Sₙ.

For any σ-instance I there is a unique (up to node relabelling) minimal instance, and it can be computed in linear time.

A σ-instance and a τ-instance are compatible if, when restricted to the common names σ ∩ τ, they are equivalent.

A common extension of compatible σ- and τ-instances can be computed in quadratic time (product automaton construction). The algorithm is linear in the size of the output.


Evaluation of Core XPath

[Figure: a tree labelled with a’s and b’s, together with the nodes selected by the queries *, //a, a/a, */a, //a/b, a/a/b and */a/following::*, shown on the original and on the compressed instance.]

Let Q be a Core XPath query and I a compressed instance. Then Q can be evaluated on I in time O(2^|Q| · |I|).

It is possible to find pathological examples for which the expansion is exponential.

However, the time is linear in the output, which is bounded by the original tree.
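One way to see why evaluation need not unfold the DAG is to memoize per DAG node. The sketch below (the node labels and the dag dict are invented for illustration) counts the matches of //a in the represented tree in a single pass over the compressed instance:

```python
from functools import lru_cache

# DAG nodes: id -> (label, child ids); shared subtrees appear only once.
dag = {
    0: ("a", (1, 1)),   # root 'a' with two copies of the shared 'b' subtree
    1: ("b", (2, 2)),   # each 'b' has two copies of the shared 'a' leaf
    2: ("a", ()),
}

@lru_cache(maxsize=None)
def count_label(node, label):
    """Number of nodes carrying `label` in the *unfolded tree* rooted at
    `node`.  Memoisation visits each DAG node once, even though the tree
    it represents may be exponentially larger."""
    lbl, kids = dag[node]
    return (lbl == label) + sum(count_label(k, label) for k in kids)

print(count_label(0, "a"))   # 1 + 2*(2*1) = 5 matches of //a in the tree
```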


Skeleton compression (|E_{M(T)}| as a percentage of |E_T|):

    Document       Size        |V_T|        tags ignored (−)   all tags (+)
    SwissProt      457.4 MB    10,903,569        7.3 %            10.1 %
    DBLP           103.6 MB     2,611,932        6.6 %             8.5 %
    TreeBank        55.8 MB     2,447,728       34.9 %            53.2 %
    OMIM            28.3 MB       206,454        5.8 %             7.0 %
    XMark            9.6 MB       190,488        6.2 %            14.4 %
    Shakespeare      7.9 MB       179,691       16.1 %            17.8 %
    Baseball       671.9 KB        28,307        0.3 %             2.6 %
    TPC-D          287.9 KB        11,765        1.4 %             2.2 %

(The full table also gives |V_{M(T)}| and |E_{M(T)}| for each setting; e.g. SwissProt’s tree of 10,903,569 nodes minimizes to 83,427 nodes and 792,620 edges with tags ignored.)


... and VOTable?

The skeletons of arrays and tables are trivial!

[Figure: a catalogue of 10⁹ Object elements, each with some descriptive stuff and 300 fields (Id, RA, DEC, ...); equivalently, a TABLE of 10⁹ TR elements, each with 300 TDs.]

[Table: evaluation of queries Q1–Q5 on SwissProt (457.4 MB), DBLP (103.6 MB), TreeBank (55.8 MB), OMIM (28.3 MB), XMark (9.6 MB), Shakespeare (7.9 MB) and Baseball (671.9 KB). For each query the table reports (1) parse time, (2) |V_{M(T)}| and (3) |E_{M(T)}| before the query, (4) query time, (5) |V_{Q(M(T))}| and (6) |E_{Q(M(T))}| after the query, and the number of nodes selected (7) in the DAG and (8) in the tree. Representative figures: SwissProt parses in 57–79 s and answers each query in 1.7–2.8 s over a skeleton of about 84,000 nodes; OMIM queries take 0.008–0.017 s; Baseball queries take 0.001–0.023 s. A few selected DAG nodes can stand for many tree nodes, e.g. 57 DAG nodes vs 106,882 tree nodes for one Shakespeare query.]


What about XQuery?

Preceding techniques work for languages (e.g. XPath) that select nodes from the XML tree.

What about languages (e.g. XQuery) that construct new nodes?

What about highly selective queries, e.g.

    FOR $X IN DB/P
    WHERE random() < 0.001
    RETURN $X

What about joins?


Generating new “tables”

On simple selections, the “edge cardinality compression” goes wrong.

[Figure: a skeleton edge of cardinality 10⁸ over children 1, 2, 3, ...; after a selection only some of the 10⁸ repetitions survive, so a single edge cardinality no longer describes them.]

Rather than modify the skeleton, generate a new skeleton and new data files.

Does this avoid exponential blow-up? Not always.


Exponential blow-up

[Figure: (a) a small compressed instance whose nodes are labelled 0 and 1; (b) the result of evaluating the path //1/*/*/* – marking the selected nodes forces the shared subtrees apart, and the compressed representation blows up exponentially.]

But what is the story in practice?


Indexing into the data files

(with Byron Choi and Rob Hutchison)

We want to evaluate highly selective queries

    FOR $X IN DB/P
    WHERE $X/keyfield = 123456
    RETURN $X/field1, $X/field2, ...

without scanning the data files for field1, field2, ... from the beginning.

How do we make the equivalent of a B-tree index work in a vectorized representation?


Markers

The skeleton is represented by a set S of tree addresses – sequences of integers.

Find a marking function M : [0, n] → S with the property that

• it is monotone w.r.t. lexicographic order, and

• it is “evenly distributed”.

Example. For a skeleton consisting of a root with 10⁸ repeated children, an excellent choice is M(i) = 1.ki.1, where kn = 10⁸.

Now place markers at corresponding positions in the data files.

We can use an index to “fast forward” to the appropriate block of the data file.
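A sketch of the fast-forward step, under an assumed layout in which the data file is a list and markers records every k-th position (fast_forward is a hypothetical helper, not the actual implementation):

```python
from bisect import bisect_right

# Hypothetical layout: the real markers would be byte offsets into a data
# file, placed to mirror the marker addresses in the skeleton.
k = 4
values = [100 + i for i in range(100)]        # the "data file"
markers = list(range(0, len(values), k))      # marked positions 0, 4, 8, ...

def fast_forward(i):
    """Jump to the nearest marker at or before position i, then scan the
    short residue -- instead of scanning the file from the beginning."""
    start = markers[bisect_right(markers, i) - 1]
    v = None
    for pos in range(start, i + 1):           # at most k - 1 extra steps
        v = values[pos]
    return v

print(fast_forward(42))   # 142, found after scanning only positions 40..42
```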


Decomposing the skeleton

[Figure: the catalogue of 10⁹ Objects again, each carrying 300 fields (Id, RA, DEC, ...) and an Annotation; the skeleton is decomposed so that the regular, array-like Object/field part is separated from the irregular Annotation part.]

Vectorized/conventional efficiency on queries on the “array-like” subset?


Conclusions

None yet.

Recent work by Byron Choi on XQuery implementations of astronomers’ queries on VOTable (astronomical XML data), not yet using the indexing techniques discussed here.

Lots more to look at. Particularly interesting is the interaction between DB (query) and scientific (array-processing) optimization.

Any ideas?