C-Store: A Column-oriented DBMS
By New England Database Group

M.I.T

Current DBMS Gold Standard
- Store fields in one record contiguously on disk
- Use B-tree indexing
- Use small (e.g. 4K) disk blocks
- Align fields on byte or word boundaries
- Conventional (row-oriented) query optimizer and executor (technology from 1979)
- Aries-style transactions

Terminology -- “Row Store”
[Figure: Record 1, Record 2, Record 3, Record 4 stored one after another]
- E.g. DB2, Oracle, Sybase, SQLServer, …

Row Stores are Write Optimized
- Can insert and delete a record in one physical write
- Good for on-line transaction processing (OLTP)
- But not for read-mostly applications
    - Data warehouses
    - CRM

Elephants Have Extended Row Stores
- With bitmap indices
- Better sequential read
- Integration of “datacube” products
- Materialized views
- But there may be a better idea…….

Column Stores
[Figure]

At 100K Feet….
- Ad-hoc queries read 2 columns out of 20
- In a very large warehouse, Fact table is rarely clustered correctly
- Column store reads 10% of what a row store reads

C-Store (Column Store) Project
- Brandeis/Brown/MIT/UMass-Boston project
    - Usual suspects participating
- Enough coded to get performance numbers for some queries
- Complete status later

We Build on Previous Pioneering Work….
- Sybase IQ (early ’90s)
- Monet (see CIDR ’05 for the most recent description)

C-Store Technical Ideas
- Code the columns to save space
- No alignment
- Big disk blocks
- Only materialized views (perhaps many)
- Focus on sorting, not indexing
- Automatic physical DBMS design

C-Store (Column Store) Technical Ideas
- Optimize for grid computing
- Innovative redundancy
- Xacts, but no need for Mohan
- Data ordered on anything, not just time
- Column optimizer and executor

How to Evaluate This Paper….
- None of the ideas in isolation merit publication
- Judge the complete system by its (hopefully intelligent) choice of
    - Small collection of inter-related powerful ideas
    - That together put performance in a new sandbox

Code the Columns
- Work hard to shrink space
    - Use extra space for multiple orders
- Fundamentally easier than in a row store
    - E.g. RLE works well

No Alignment
- Densepack columns
    - E.g. a 5 bit field takes 5 bits
- Current CPU speed going up faster than disk bandwidth
    - Faster to shift data in CPU than to waste disk bandwidth

Big Disk Blocks
- Tunable
- Big (minimum size is 64K)

Only Materialized Views
- Projection (materialized view) is some number of columns from a fact table
- Plus columns in a dimension table, with a 1-n join between Fact and Dimension table
- Stored in order of a storage key(s)
- Several may be stored!!!!!
- With a permutation, if necessary, to map between them

Only Materialized Views
- Table (as the user specified it and sees it) is not stored!
- No secondary indexes (they are a one column sorted MV plus a permutation, if you really want one)

Example
- User view:
    - EMP (name, age, salary, dept)
    - Dept (dname, floor)
- Possible set of MVs:
    - MV-1 (name, dept, floor) in floor order
    - MV-2 (salary, age) in age order
    - MV-3 (dname, salary, name) in salary order

Different Indexing
- Sequential, few values: RLE encoded; conventional B-tree at the value level
- Sequential, many values: Delta encoded; conventional B-tree at the block level
- Non sequential, few values: Bitmap per value
- Non sequential, many values: Conventional Gzip; conventional B-tree at the block level

Automatic Physical DBMS Design
- Not enough 4-star wizards to go around
- Accept a “training set” of queries and a space budget
- Choose the MVs auto-magically
- Re-optimize periodically based on a log of the interactions

Optimize for Grid Computing
- I.e. shared-nothing
    - Dewitt (Gamma) was right
- Horizontal partitioning and intra-query parallelism as in Gamma

Innovative Redundancy
- Hardly any warehouse is recovered by a redo from the log
    - Takes too long!
- Store enough MVs at enough places to ensure K-safety
- Rebuild dead objects from elsewhere in the network
- K-safety is a DBMS-design problem!

XACTS: No Mohan
- Undo from a log (that does not need to be persistent)
- Redo by rebuild from elsewhere in the network

XACTS: No Mohan
- Snapshot isolation (run queries as of a tunable time in the recent past)
    - To solve read-write conflicts
- Distributed Xacts
    - Without a prepare message (no 2 phase commit)

Storage (sort) Key(s) is not Necessarily Time
- That would be too limiting
- So how to do fast updates to densepack column storage that is not in entry sequence?

Solution: a Hybrid Store
[Figure: three components connected in sequence]
- Write-optimized column store (much like Monet)
- Tuple mover (batch rebuilder)
- Read-optimized column store (what we have been talking about so far)

Column Executor
- Column operations, not row operations
- Columns remain coded, if possible
- Late materialization of columns

Column Optimizer
- Chooses MVs on which to run the query
    - Most important task
- Build in snowflake schemas
    - Which are simple to optimize without exhaustive search
- Looking at extensions

Current Performance
- 100X popular row store in 40% of the space
- 10X popular column store in 70% of the space
- 7X popular row store in 1/6th of the space
- Code available with BSD license

Structure Going Forward
- Vertica
    - Very well financed start-up to commercialize C-Store
    - Doing the heavy lifting
- University Research
    - Funded by Vertica

Vertica
- Complete alpha system in December ’05
    - Everything, including DBMS designer
    - With current performance!
- Looking for early customers to work with (see me if you are interested)

University Research
- Extension of algorithms to non-snowflake schemas
- Study of L2 cache performance
- Study of coding strategies
- Study of executor options
- Study of recovery tactics
- Non-cursor interface
- Study of optimizer primitives
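
The slides claim that RLE works well on columns and that sequential columns with few values are RLE encoded. A minimal Python sketch of run-length encoding on a sorted column may make this concrete. The function names and the sample `floors` data are hypothetical, chosen only for illustration, and this is not the C-Store implementation.

```python
from itertools import groupby


def rle_encode(column):
    """Collapse each run of equal adjacent values into a (value, run_length) pair."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(column)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original column."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out


# Hypothetical column stored in sort order (e.g. a "floor" column in floor order).
floors = [1, 1, 1, 2, 2, 3, 3, 3, 3]
encoded = rle_encode(floors)
decoded = rle_decode(encoded)
```

Because the column is stored in sort order, equal values are adjacent and nine entries shrink to three pairs. The same column in arbitrary order would produce many short runs and little saving, which is one reason the slides pair coding with sorting.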
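
The "No Alignment" slide says a 5 bit field takes 5 bits, trading CPU shifts for disk bandwidth. The following Python sketch shows one way such dense packing could work. The `pack` and `unpack` helpers and the big-endian bit layout are assumptions made for this example, not a description of C-Store's on-disk format.

```python
def pack(values, width):
    """Pack unsigned integers into bytes, using exactly `width` bits per value."""
    acc = 0
    for v in values:
        if not 0 <= v < (1 << width):
            raise ValueError(f"{v} does not fit in {width} bits")
        acc = (acc << width) | v
    total_bits = width * len(values)
    pad = (-total_bits) % 8  # zero bits added at the end to reach a whole byte
    acc <<= pad
    return acc.to_bytes((total_bits + pad) // 8, "big")


def unpack(data, width, count):
    """Recover `count` values of `width` bits each from packed bytes."""
    acc = int.from_bytes(data, "big")
    total_bits = len(data) * 8
    mask = (1 << width) - 1
    # Shift each field down to the low bits, then mask off everything else.
    return [(acc >> (total_bits - width * (i + 1))) & mask for i in range(count)]


# Hypothetical column of eight values that each fit in 5 bits.
values = [3, 31, 0, 17, 9, 1, 22, 5]
packed = pack(values, 5)
restored = unpack(packed, 5, len(values))
```

Eight 5-bit values occupy 40 bits, or 5 bytes, whereas byte-aligned storage would need 8 bytes. The cost is the shifting and masking in `unpack`, which is the CPU work the slide argues is cheaper than wasted disk bandwidth.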
