EECS 262a
Advanced Topics in Computer Systems
Lecture 17
C-Store / DB Cracking
October 29th, 2014
John Kubiatowicz
Electrical Engineering and Computer Sciences
University of California, Berkeley
http://www.eecs.berkeley.edu/~kubitron/cs262

Today’s Papers
• C-Store: A Column-oriented DBMS*
Mike Stonebraker, Daniel J. Abadi, Adam Batkin, Xuedong Chen, Mitch Cherniack, Miguel Ferreira, Edmond Lau, Amerson Lin, Sam Madden, Elizabeth O’Neil, Pat O’Neil, Alex Rasin, Nga Tran, Stan Zdonik. Appears in Proceedings of the ACM Conference on Very Large Databases (VLDB), 2005
• Database Cracking+
Stratos Idreos, Martin L. Kersten, and Stefan Manegold. Appears in the 3rd Biennial Conference on Innovative Data Systems Research (CIDR), January 7-10, 2007
• Wednesday: Comparison of PDBMS, CS, MR
• Thoughts?
*Some slides based on slides from Jianlin Feng, School of Software, Sun Yat-Sen University
+Some slides based on slides from Stratos Idreos, CWI Amsterdam, The Netherlands
10/29/2014 cs262a-F14 Lecture-17

Relational Data: Logical View
Name  Age  Department  Salary
Bob   25   Math        10K
Bill  27   EECS        50K
Jill  24   Biology     80K
Relation or Tables

C-Store: A Column-oriented DBMS
• Physical layout:
Row store: Record 1, Record 2, Record 3
or
Column store: Column 1, Column 2, Column 3

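The two physical layouts can be illustrated with a minimal Python sketch, assuming the three-record table from the logical view above. The helper names `project` and `reconstruct` are hypothetical and exist only for this illustration; a real system lays the data out on disk pages rather than in Python lists.

```python
# Logical relation from the "Relational Data: Logical View" slide.
schema = ("Name", "Age", "Department", "Salary")
rows = [
    ("Bob", 25, "Math", "10K"),
    ("Bill", 27, "EECS", "50K"),
    ("Jill", 24, "Biology", "80K"),
]

# Row store: the fields of one record are adjacent (Record 1, Record 2, ...).
row_store = [field for record in rows for field in record]

# Column store: the values of one attribute are adjacent (Column 1, Column 2, ...).
column_store = {name: [record[i] for record in rows]
                for i, name in enumerate(schema)}


def project(columns, wanted):
    """Read only the requested columns (hypothetical helper)."""
    return {name: columns[name] for name in wanted}


def reconstruct(columns, position):
    """Rebuild one logical tuple from the same position in every column."""
    return tuple(columns[name][position] for name in schema)


ages_only = project(column_store, ["Age"])
second_record = reconstruct(column_store, 1)
```

Projection touches one list in the column layout, while rebuilding a whole record has to visit every column.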
Row Stores
• On-Line Transaction Processing (OLTP)
– ATM, POS in supermarkets
• Characteristics of OLTP applications:
– Transactions that involve small numbers of records (or tuples)
– Frequent updates (including queries)
– Many users
– Fast response times
• OLTP needs a write-optimized row store
– Insert and delete a record in one physical write

Row Store: Columns Stored Together
[Figure: Page i holds the data records Rid = (i,1), Rid = (i,2), ..., Rid = (i,N). The SLOT DIRECTORY at the end of the page holds the slot array (slots N ... 2 1, entries shown: 20, 16, 24), the number of slots (N), and a pointer to the start of free space.]
Record id = <page id, slot #>
• Easy to add new record, but might read unnecessary data (wasted memory and I/O bandwidth)

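The slotted page with Record id = <page id, slot #> can be sketched roughly as follows. This is a toy model under stated assumptions: the class name `SlottedPage`, the 4096-byte page size, and keeping the slot directory in a Python list (rather than inside the page bytes) are all simplifications made for illustration.

```python
class SlottedPage:
    """Toy slotted page: records fill the page from the front, and a slot
    directory maps each slot number to (offset, length)."""

    def __init__(self, page_id, size=4096):
        self.page_id = page_id
        self.data = bytearray(size)
        self.free_start = 0   # pointer to start of free space
        self.slots = []       # slot directory; None marks a deleted record

    def insert(self, record):
        if self.free_start + len(record) > len(self.data):
            raise ValueError("page full")
        offset = self.free_start
        self.data[offset:offset + len(record)] = record
        self.free_start += len(record)
        self.slots.append((offset, len(record)))
        # Record id = <page id, slot #>, with slots numbered from 1.
        return (self.page_id, len(self.slots))

    def read(self, rid):
        page_id, slot_no = rid
        if page_id != self.page_id:
            raise KeyError("record lives on another page")
        entry = self.slots[slot_no - 1]
        if entry is None:
            raise KeyError("record was deleted")
        offset, length = entry
        return bytes(self.data[offset:offset + length])

    def delete(self, rid):
        # Only the slot entry changes; other record ids stay valid.
        self.slots[rid[1] - 1] = None


page = SlottedPage(page_id=7)
rid_bob = page.insert(b"Bob,25,Math,10K")
rid_bill = page.insert(b"Bill,27,EECS,50K")
```

Reading one record returns all of its fields at once, which is why a query that needs a single attribute still pays for the whole row.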
Current DBMS Gold Standard
• Store columns from one record contiguously on disk
• Use B-tree indexing
• Use small (e.g. 4K) disk blocks
• Align fields on byte or word boundaries
• Conventional (row-oriented) query optimizer and executor (technology from 1979)
• Aries-style transactions

OLAP and Data Warehouses
• On-Line Analytical Processing (OLAP)
– Flexible reporting for Business Intelligence
• Characteristics of OLAP applications
– Transactions that involve large numbers of records
– Frequent Ad-hoc queries and infrequent updates
– A few decision-making users
– Fast response times
• Data Warehouses facilitate reporting and analysis
– Read-Mostly
• Other read-mostly applications
– Customer Relationship Management (Siebel, Oracle)
– Catalog Search in E-Commerce (Amazon.com, Bestbuy.com)

Column Stores
• Logical data model: Relational Model
• Key Intuition: Only read relevant columns
– Example: Ad-hoc queries read 2 columns out of 20
• Multiple prior column store implementations
– Sybase IQ (early ’90s, bitmap index)
– Addamark (i.e., SenSage, for Event Log data warehouse)
– KDB (Column-stores for financial services companies)
– MonetDB (Hyper-Pipelining Query Execution, CIDR’05)

Row and Column Stores
• Only read necessary data, but might need multiple seeks

Read-Optimized Databases
[Figure: row stores (SQL Server, DB2, Oracle) keep the tuples "1 Joe 45", "2 Sue 37", … in a single file; column stores (Sybase IQ, MonetDB, KDB, C-Store) keep the columns (1, 2, …), (Joe, Sue, …), (45, 37, …) in 3 files.]

Rows versus Columns
[Figure: to return "Joe 45", a row store projects the needed fields out of the stored tuple "1 Joe 45"; a column store seeks into the separate column files and reconstructs the tuple from "Joe" and "45".]
• Effect of column-orientation on performance?
– Read less, seek more, so the outcome depends on prefetching, query selectivity, tuple width, and competing query traffic

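The "read less, seek more" trade-off can be made concrete with a back-of-the-envelope calculation. The sketch below uses the earlier example of an ad-hoc query that reads 2 columns out of 20; the table size, the fixed 4-byte field width, and the one-seek-per-file model are assumptions chosen only for illustration.

```python
def bytes_read(num_tuples, field_bytes, columns_read):
    """Bytes scanned when `columns_read` fixed-width columns are read in full."""
    return num_tuples * field_bytes * columns_read


NUM_TUPLES = 1_000_000   # assumed table size
FIELD_BYTES = 4          # assumed fixed field width
TOTAL_COLUMNS = 20
QUERY_COLUMNS = 2

# Row store: a single file, but every field of every tuple is scanned.
row_store_bytes = bytes_read(NUM_TUPLES, FIELD_BYTES, TOTAL_COLUMNS)
row_store_seeks = 1

# Column store: only the relevant column files, at least one seek per file.
column_store_bytes = bytes_read(NUM_TUPLES, FIELD_BYTES, QUERY_COLUMNS)
column_store_seeks = QUERY_COLUMNS

savings_factor = row_store_bytes // column_store_bytes
```

Under these assumptions the column store scans a tenth of the bytes but needs twice the seeks, which is why prefetching and competing traffic matter.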
Architecture of Vertica C-Store
[Figure: a Writeable Store (WS), a Tuple Mover, and several Read-optimized Stores (RS)]

C-Store Technical Ideas
• Column Store with some “novel” ideas (below)
• Only materialized views on each relation (perhaps many)
• Active data compression
• Column-oriented query executor and optimizer
• Shared-nothing architecture
• Replication-based concurrency control and recovery

Basic Concepts
• A logical table is physically represented as a set of projections
• Each projection consists of a set of columns
– Columns are stored separately, along with a common sort order defined by SORT KEY
• Each column appears in at least one projection
• A column can have different sort orders if it is stored in multiple projections

Example C-Store Projection
• LINEITEM(shipdate, quantity, retflag, suppkey | shipdate, quantity, retflag)
– First sorted by shipdate
– Second sorted by quantity
– Third sorted by retflag
• Sorting increases locality of data
– Favors compression techniques such as Run-Length Encoding (see Elephant paper)

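A small sketch may help show why a common sort order favors Run-Length Encoding. The LINEITEM rows below are invented sample data, and `make_projection` and `rle` are hypothetical helpers for this illustration, not C-Store's actual storage code.

```python
from itertools import groupby

# Invented sample rows: (shipdate, quantity, retflag, suppkey).
lineitem = [
    ("2014-01-03", 5, "N", 11),
    ("2014-01-01", 7, "R", 12),
    ("2014-01-01", 5, "N", 13),
    ("2014-01-02", 7, "N", 11),
    ("2014-01-01", 5, "R", 12),
]
COLS = {"shipdate": 0, "quantity": 1, "retflag": 2, "suppkey": 3}


def make_projection(rows, columns, sort_key):
    """Store each chosen column separately, all in the order of sort_key."""
    ordered = sorted(rows, key=lambda r: tuple(r[COLS[c]] for c in sort_key))
    return {c: [r[COLS[c]] for r in ordered] for c in columns}


def rle(values):
    """Run-length encode a column as (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


# LINEITEM(shipdate, quantity, retflag, suppkey | shipdate, quantity, retflag)
projection = make_projection(
    lineitem,
    columns=["shipdate", "quantity", "retflag", "suppkey"],
    sort_key=["shipdate", "quantity", "retflag"],
)

unsorted_runs = rle([row[0] for row in lineitem])
sorted_runs = rle(projection["shipdate"])
```

The leading sort column collapses into one run per distinct value, while the same column in arrival order produces more, shorter runs.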
Example: Join over Two Columns
[Figure: example join over two columns]

C-Store Operators
• Selection
– Produce bitmaps that can be efficiently combined
• Mask
– Materialize a set of values from a column and a bitmap
• Permute
– Reorder a column using a join index
• Projection
– Free operation!
– Two columns in the same order can be concatenated for free
• Join
– Produces positions rather than values

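The operators listed above can be sketched over plain Python lists. This is only an illustration of the position-based style: the function names, the list-of-booleans bitmap, and the reading of a join index as "source position for each output row" are assumptions, not the C-Store executor's interfaces.

```python
def select(column, predicate):
    """Selection: produce a bitmap with one bit per column position."""
    return [predicate(value) for value in column]


def bitmap_and(a, b):
    """Combine two selection bitmaps position by position."""
    return [x and y for x, y in zip(a, b)]


def mask(column, bitmap):
    """Mask: materialize the values whose bit is set."""
    return [value for value, bit in zip(column, bitmap) if bit]


def permute(column, join_index):
    """Permute: join_index[i] is the source position of output row i."""
    return [column[pos] for pos in join_index]


def join_positions(left, right):
    """Join: return matching (left_pos, right_pos) pairs, not values."""
    where = {}
    for rpos, value in enumerate(right):
        where.setdefault(value, []).append(rpos)
    return [(lpos, rpos)
            for lpos, value in enumerate(left)
            for rpos in where.get(value, [])]


quantity = [5, 7, 5, 9]
retflag = ["N", "R", "N", "N"]

bits = bitmap_and(select(quantity, lambda q: q > 5),
                  select(retflag, lambda f: f == "N"))
matching_quantities = mask(quantity, bits)
reordered = permute(["a", "b", "c"], [2, 0, 1])
pairs = join_positions([11, 12, 11], [12, 11])
```

Because selections and joins pass positions around, values only need to be materialized by `mask` at the end of the plan.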
Column Store has High Compressibility
• Each attribute is stored in a separate column
– Related values are compressible (versus values of separate attributes)
• Compression benefits
– Reduces the data sizes
– Improves disk (and memory) I/O performance by:
» reducing seek times (related data stored nearer together)
» reducing transfer times (less data to read/write)
» increasing buffer hit rate (buffer can hold larger fraction of data)

Compression Methods
• Dictionary
… ‘low’ … → … 00 …
… ‘high’ … → … 10 …
… ‘low’ … → … 00 …
… ‘normal’ … → … 01 …
• Bit-pack
– Pack several attributes inside a 4-byte word
– Use as many bits as max-value
• Delta
– Base value per page
– Arithmetic differences
• No Run-Length Encoding (unlike Elephant paper)

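The three compression methods can be sketched in a few lines. The dictionary codes follow the slide (‘low’ → 00, ‘normal’ → 01, ‘high’ → 10); everything else is an assumption for illustration, in particular that "as many bits as max-value" means the bit length of the largest value and that delta stores each value's difference from the page's base value.

```python
CODES = {"low": 0b00, "normal": 0b01, "high": 0b10}


def dictionary_encode(values, codes):
    """Dictionary: replace each value by its small integer code."""
    return [codes[value] for value in values]


def bit_pack(values, bits):
    """Bit-pack: put several small values into one 4-byte word,
    `bits` bits each, first value in the lowest bits."""
    if len(values) * bits > 32:
        raise ValueError("does not fit in a 4-byte word")
    word = 0
    for i, value in enumerate(values):
        if value >= (1 << bits):
            raise ValueError("value needs more bits")
        word |= value << (i * bits)
    return word


def bit_unpack(word, bits, count):
    low_bits = (1 << bits) - 1
    return [(word >> (i * bits)) & low_bits for i in range(count)]


def delta_encode(page):
    """Delta: keep one base value per page plus arithmetic differences."""
    base = page[0]
    return base, [value - base for value in page[1:]]


def delta_decode(base, deltas):
    return [base] + [base + d for d in deltas]


encoded = dictionary_encode(["low", "high", "low", "normal"], CODES)
bits_needed = max(encoded).bit_length()
word = bit_pack(encoded, bits_needed)
base, deltas = delta_encode([1000, 1003, 1001, 1010])
```

Four dictionary codes fit in one byte of the packed word here, and the deltas are small enough to be bit-packed in turn.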
C-Store Use of Snapshot Isolation
• Snapshot Isolation for Fast OLAP/Data Warehousing
– Allows very fast transactions without locks
– Can read large consistent snapshot of database
• Divide into RS and WS stores
– Read Store is Optimized for Fast Read-Only Transactions
– Write Store is Optimized for Transactional Updates
• Low Water Mark (LWM)
– Represents earliest epoch at which read-only transactions can run
– RS contains tuples added before LWM
• High Water Mark (HWM)
– Represents latest epoch at which read-only transactions can run

Other Ideas
• K-safety: Can handle up to K-1 failures
– Every piece of data replicated K times
– Different projections sorted in different ways
• Join Tables
– Construct original tuples given covering projections
– Vertica gave up on Join Tables – too expensive, require super-projections

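The role of the two water marks can be sketched as a visibility check. This is a minimal model under assumptions: each tuple version is tagged with an insertion epoch and an optional deletion epoch, and the names `Version`, `visible`, and `snapshot_read` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Version:
    value: str
    inserted_at: int                  # epoch in which the tuple was inserted
    deleted_at: Optional[int] = None  # epoch in which it was deleted, if any


def visible(version, epoch):
    """A tuple is visible if it was inserted at or before the snapshot
    epoch and not yet deleted at that epoch."""
    return version.inserted_at <= epoch and (
        version.deleted_at is None or version.deleted_at > epoch)


def snapshot_read(versions, epoch, lwm, hwm):
    """Lock-free read-only query at an epoch between LWM and HWM."""
    if not lwm <= epoch <= hwm:
        raise ValueError("epoch must lie between LWM and HWM")
    return [v.value for v in versions if visible(v, epoch)]


versions = [
    Version("a", inserted_at=1),
    Version("b", inserted_at=2, deleted_at=4),
    Version("c", inserted_at=5),
]
at_epoch_3 = snapshot_read(versions, epoch=3, lwm=2, hwm=4)
at_epoch_4 = snapshot_read(versions, epoch=4, lwm=2, hwm=4)
```

The reader never takes a lock: it filters by epoch, so concurrent updates in later epochs do not affect its result.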
Evaluation?
• Series of 7 queries against C-Store vs two commercial DBMS
– C-Store faster in all cases, sometimes significantly
• Why so much faster?
– Column Representation – avoid extra reads
– Overlapping projections – multiple orderings of column as appropriate
– Better data compression
– Query operators operating on compressed data

Summary
• Columns outperform rows in listed workloads
– Reasons:
» Column representation avoids reads of unused attributes
» Query operators operate on compressed representation, mitigating storage barrier problem
» Avoids memory-bandwidth bottleneck in rows
• Storing overlapping projections, rather than the whole table, allows storage of multiple orderings of a column as appropriate
• Better compression of data allows more orderings in the same space
• Results from other papers:
– Prefetching is key to columns outperforming rows
– Systems with competing traffic favor columns

Is this a good paper?
• What were the authors’ goals?
• What about the evaluation/metrics?
• Did they convince you that this was a good
system/approach?
• Were there any red-flags?
• What mistakes did they make?
• Does the system/approach meet the “Test of Time”
challenge?
• How would you review this paper today?

BREAK

DB Physical Organization Problem
• Many DBMS perform well and efficiently only after being tuned by a DBA
• DBA decides:
– Which indexes to build?
– On which data parts?
– and when to build them?

Timeline
• Sample workload
• Analyze performance
• Prepare estimated physical design
• Queries
Very complex and time consuming process!
What about:
• Dynamic, changing workloads?
• Very Large Databases?

Database Cracking
• Solve challenges of dynamic environments:
– Remove all tuning and physical design steps, but still get similar performance as a fully tuned system
• How?
• Design new auto-tuning kernels
– DBA with cracking (operators, plans, structures, etc.)

Database Cracking
• No monitoring
• No preparation
• No external tools
• No full indexes
• No human involvement
• Continuous on-the-fly physical reorganization
– Partial, incremental, adaptive indexing
• Designed for modern column-stores

Cracking Example
• Each query is treated as advice on how data should be stored
– Triggers physical re-organization of the database
Q1: select * from R where R.A > 10 and R.A < 14
Column A: 13, 16, 4, 9, 2, 12, 7, 1, 19, 3, 14, 11, 8, 6

Cracking Design
• The first time a range query is posed on an attribute A, a cracking DBMS makes a copy of column A, called the cracker column of A
• A cracker column is continuously physically reorganized based on the queries that touch attribute A, such that each result is in a contiguous space
• For each cracker column, there is a cracker index

Cracking Algorithms
• Two types of cracking algorithms based on select’s where clause
where X < #: Split a piece into two new pieces
where # < X < #: Split a piece into three new pieces

Cracker Select Operator
• Traditional select operator
– Scans the column
– Returns a new column that contains the qualifying values
• The cracker select operator
– Searches the cracker index
– Physically re-organizes the pieces found
– Updates the cracker index
– Returns a slice of the cracker column as the result
• More steps, but faster because it analyzes less data

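The cracker select operator can be sketched as follows. This is a simplified model, not the paper's implementation: the cracker index is a sorted Python list of (bound, position) cuts instead of a tree, bounds are restricted to integers so that "A > 10 and A < 14" becomes the half-open range [11, 14), and the three-way split is done as two successive two-way splits. The names `crack_in_two` and `CrackerColumn` are chosen for this illustration.

```python
from bisect import bisect_left, insort


def crack_in_two(values, lo, hi, bound):
    """Partition values[lo:hi] in place so that values < bound come first.
    Returns the position of the first value >= bound."""
    i, j = lo, hi - 1
    while i <= j:
        if values[i] < bound:
            i += 1
        else:
            values[i], values[j] = values[j], values[i]
            j -= 1
    return i


class CrackerColumn:
    def __init__(self, column):
        self.values = list(column)  # cracker column: a copy of the base column
        self.index = []             # cracker index: sorted (bound, position) cuts

    def _cut(self, bound):
        """Search the index; crack the one piece containing `bound` if needed."""
        k = bisect_left(self.index, (bound,))
        if k < len(self.index) and self.index[k][0] == bound:
            return self.index[k][1]          # this cut already exists
        lo = self.index[k - 1][1] if k > 0 else 0
        hi = self.index[k][1] if k < len(self.index) else len(self.values)
        pos = crack_in_two(self.values, lo, hi, bound)
        insort(self.index, (bound, pos))     # update the cracker index
        return pos

    def select(self, low, high):
        """Return the values v with low <= v < high as one contiguous slice."""
        start = self._cut(low)
        end = self._cut(high)
        return self.values[start:end]


column_a = [13, 16, 4, 9, 2, 12, 7, 1, 19, 3, 14, 11, 8, 6]
cracker = CrackerColumn(column_a)
q1 = cracker.select(11, 14)  # A > 10 and A < 14 on integers
q2 = cracker.select(8, 17)   # A > 7 and A <= 16 on integers
```

The second query only partitions the two pieces that contain its bounds; the middle piece created by the first query is reused untouched.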
Cracking Example
• Each query is treated as advice on how data should be stored
– Physically reorganize based on the selection predicate
Gain knowledge on how to organize data on-the-fly within the select-operator
Improve data access for future queries
The more cracking, the more learned

Q1: select * from R where R.A > 10 and R.A < 14
Q2: select * from R where R.A > 7 and R.A ≤ 16

Column A: 13, 16, 4, 9, 2, 12, 7, 1, 19, 3, 14, 11, 8, 6

Cracker Column A after Q1:
Piece 1 (A ≤ 10): 4, 9, 2, 7, 1, 3, 8, 6
Piece 2 (10 < A < 14): 13, 12, 11 (result tuples of Q1)
Piece 3 (14 ≤ A): 16, 19, 14

Cracker Column A after Q2:
Piece 1 (A ≤ 7): 4, 2, 1, 3, 6, 7
Piece 2 (7 < A ≤ 10): 9, 8
Piece 3 (10 < A < 14): 13, 12, 11
Piece 4 (14 ≤ A ≤ 16): 14, 16
Piece 5 (16 < A): 19

Self-Organizing Behavior (Count(*) range query)

Self-Organizing Behavior (TPC-H Query 6)
• TPC-H is an ad-hoc, decision support benchmark
– Business oriented ad-hoc queries, concurrent data modifications
• Example:
– Tell me the amount of revenue increase that would have resulted from eliminating certain company-wide discounts in a given percentage range in a given year
• Workload:
– Database load
– Execution of 22 read-only queries in both single and multi-user mode
– Execution of 2 refresh functions

Is this a good paper?
• What were the authors’ goals?
• What about the evaluation/metrics?
• Did they convince you that this was a good system/approach?
• Were there any red-flags?
• What mistakes did they make?
• Does the system/approach meet the “Test of Time” challenge?
• How would you review this paper today?