Query Optimization 3: Cost Estimation
R&G, Chapters 12, 13, 14
Lecture 15

Administrivia
• Homework 2 due tonight
  – Remember you have 4 slip days for the course
• Homeworks 3 & 4 available later this week
  – 3 is a written assignment on optimization, due before Midterm 2
  – 4 is a programming assignment implementing query processing, due after Spring Break
• Midterm 2 is 3/22, two weeks from Thursday

Review: Query Processing
• Queries start out as SQL
• The database translates SQL into one or more relational algebra plans
• A plan is a tree of operators, with an access path for each
• An access path is how an operator gets its tuples
  – An operator working directly on a table can use a scan or an index
  – Some operators, like sort-merge join or group-by, need their input tuples sorted
  – Often, operators are pipelined, consuming tuples output by earlier operators in the tree
• The database estimates the cost of the various plans and chooses the least expensive

Today
• Review costs for:
  – Sorting
  – Selection
  – Projection
  – Joins
• Re-examine hashing for:
  – Projection
  – Joins

General External Merge Sort
• More than 3 buffer pages: how can we utilize them?
• To sort a file of N pages using B buffer pages:
  – Pass 0: use all B buffer pages. Produce ⌈N/B⌉ sorted runs of B pages each.
  – Passes 1, 2, …: merge B–1 runs at a time.
[Figure: B–1 input buffers feeding one output buffer in main memory; runs stream in from disk and the merged run streams back to disk]

Cost of External Merge Sort
• Minimum amount of memory: 3 pages
  – Initial sorted runs of 3 pages
  – Then 2-way merges of sorted runs (2 pages for input, 1 for output)
  – # of passes: 1 + ⌈log2⌈N/3⌉⌉
• With more memory, fewer passes
  – With B pages, # of passes: 1 + ⌈log_{B–1}⌈N/B⌉⌉
• I/O cost = 2N × (# of passes), since each pass reads and writes every page

External Sort Example
• E.g., with 5 buffer pages, to sort a 108-page file:
  – Pass 0:
    • 22 sorted runs of 5 pages each (last run is only 3 pages)
    • Now do 4-way (B–1) merges
  – Pass 1:
    • 6 sorted runs of 20 pages each (last run is only 8 pages)
  – Pass 2:
    • 2 sorted runs: 80 pages and 28 pages
  – Pass 3:
    • Sorted file of 108 pages
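To make the pass-counting concrete, here is a minimal Python sketch of this cost model (the function name and the simplification that a partial run costs the same as a full one are my own, not from the lecture):

```python
from math import ceil

def external_sort_cost(n_pages: int, n_buffers: int):
    """Return (# of passes, total I/Os) for an external merge sort of
    n_pages data pages using n_buffers buffer pages (n_buffers >= 3)."""
    runs = ceil(n_pages / n_buffers)   # pass 0: sorted runs of B pages
    passes = 1
    while runs > 1:                    # each later pass merges B-1 runs
        runs = ceil(runs / (n_buffers - 1))
        passes += 1
    return passes, 2 * n_pages * passes  # every pass reads and writes all N pages

# The example above: 4 passes (22 -> 6 -> 2 -> 1 runs), 2*108*4 = 864 I/Os.
print(external_sort_cost(108, 5))      # (4, 864)
```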
Using B+ Trees for Sorting
• Scenario: the table to be sorted has a B+ tree index on the sort column(s)
• Idea: retrieve records in order by traversing the leaf pages
• Is this a good idea? Cases to consider:
  – B+ tree is clustered: good idea!
  – B+ tree is not clustered: could be a very bad idea!
• I/O cost:
  – Clustered tree: ~1.5N
  – Unclustered tree: up to 1 I/O per tuple in the worst case!

Selections: "age < 20", "fname = Bob", etc.
• No index
  – Sequential scan over all tuples; cost: N I/Os
• Sorted data
  – Binary search; cost: log2(N) I/Os
• Clustered B-tree
  – Cost: 2–3 I/Os to find the first record, + 1 I/O per page of qualifying tuples
• Unclustered B-tree
  – Cost: 2–3 I/Os to find the first RID, + ~1 I/O per qualifying tuple
• Clustered hash index
  – Cost: ~1.2 I/Os to find the bucket, which holds all matching tuples
• Unclustered hash index
  – Cost: ~1.2 I/Os to find the bucket, + ~1 I/O per matching tuple

Selection Exercise
(The numbers below assume a 100-page table; sal > 100 evidently matches ~500 tuples, and age = 25 matches 20.)
• sal > 100
  – Sequential scan: 100 I/Os
  – B-tree on sal? Unclustered: ~3 I/Os to reach the leaf + ~1 I/O per matching tuple ≈ 503 I/Os
  – B-tree on <age, sal>? Can't use it: age is the leading column, so the index is not ordered on sal
• age = 25
  – Sequential scan: 100 I/Os
  – Hash on age: ~1.2 I/Os to reach the bucket
    • 20 matching tuples, ~1 I/O for each
  – B-tree on <age, sal>: ~3 I/Os to find the leaf + # of matching pages
    • 20 matching tuples, clustered: ~2 pages of matches, so ~5 I/Os total
• age > 20
  – Sequential scan: 100 I/Os
  – Hash on age? No: a hash index can't serve a range query.
  – B-tree on <age, sal>: ~3 I/Os to find the leaf + # of matching pages
    • age > 20 qualifies 90% of pages, or ~90 × 1.5 = 135 I/Os

Selection Exercise (cont.)
• eid = 1000
  – Sequential scan: ~50 I/Os (on average, scan half the table for an equality match on a key)
  – Hash on eid: ~1.2 I/Os to find the bucket + 1 I/O to get the record
• sal > 100 and age < 30
  – Sequential scan: 100 I/Os
  – B-tree on <age, sal>: ~3 I/Os to find the leaf; 30% of pages match age < 30, so ~30 × 1.5 = 45 I/Os (sal > 100 is checked on the fly)

Projection
• Expensive when eliminating duplicates
• Can do this via:
  – Sorting: costs no more than an external sort
    • Cheaper if you project out columns in the initial pass, since more projected tuples fit on each page
  – Hashing: build a hash table; duplicates end up in the same bucket

An Alternative to Sorting: Remove Duplicates with Hashing!
• Idea:
  – Many of the things we use sort for don't actually exploit the order of the sorted data
  – e.g., removing duplicates in DISTINCT
  – e.g., finding matches in JOIN
    • Often it is good enough to match all tuples with equal values
• Hashing does this!
  – And may be cheaper than sorting! (Hmmm…!)
  – But how to do it for data sets bigger than memory?

General Idea
• Two phases:
  – Partition: use a hash function h to split tuples into partitions on disk
    • Key property: all matches live in the same partition
  – Rehash: for each partition on disk, build a main-memory hash table using a second hash function h2

Two Phases
[Figure: Phase 1 (Partition) – the original relation streams through one input buffer; h routes each tuple to one of B–1 output buffers, which spill to B–1 partitions on disk. Phase 2 (Rehash) – each partition Ri (≤ B pages) is read back and hashed into a main-memory table with h2, and results are written out.]

Duplicate Elimination using Hashing
• In the rehash phase, read one partition at a time
• For each group of identical tuples, output just one
• (A code sketch of this appears after the memory analysis below.)
[Figure: same two-phase picture, with the h2 hash table used to drop duplicates]

Hashing in Detail
• Two phases, two hash functions
• First pass: partition into B–1 buckets
• E.g., B = 5 pages, h(x) = the two low-order bits of x
[Figure: input file (6 2 1 9 4 1 5 6 1 …) streamed through memory, each value routed by h(x) to one of 4 output partition files]

Memory Requirement and I/O Cost of Hashing
• If we can hash in two passes, the cost is 4N (read + write in each of the two passes)
• How big a table can we hash in two passes?
  – B–1 partitions result from the partitioning phase
  – Each should be no more than B pages in size
  – Answer: B(B–1) ≈ B² pages. Said differently: we can hash a table of N pages in about √N pages of memory
  – Note: this assumes the hash function distributes records evenly!
• Have a bigger table? Recursive partitioning!
• How does this compare with external sorting?

Memory Requirement for External Sorting
• How big a table can we sort in two passes?
  – Each sorted run after Pass 0 is of size B
  – We can merge up to B–1 sorted runs in Pass 1
  – Answer: B(B–1) ≈ B² pages. Said differently: we can sort a table of N pages in about √N pages of memory
• Have a bigger table? Additional merge passes!
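Before comparing the two approaches, here is a minimal Python sketch of the two-phase hashing idea, specialized to duplicate elimination. The in-memory lists stand in for spilled disk partitions, the partition count is an illustrative stand-in for B–1, and Python's built-in hashing plays the roles of h and h2:

```python
def hash_distinct(tuples, n_partitions=4):
    """Two-phase duplicate elimination: partition with h, then rehash
    each partition in memory (a Python set plays the h2 hash table)."""
    # Phase 1 (partition): h sends every copy of a value to the same
    # partition, so duplicates never straddle partitions.
    partitions = [[] for _ in range(n_partitions)]
    for t in tuples:
        partitions[hash(t) % n_partitions].append(t)  # h
    # Phase 2 (rehash): each partition is assumed to fit in memory;
    # the set keeps exactly one tuple per group of identical tuples.
    out = []
    for part in partitions:
        seen = set()
        for t in part:
            if t not in seen:   # h2 lookup
                seen.add(t)
                out.append(t)
    return out

print(hash_distinct([6, 2, 1, 9, 4, 1, 5, 6, 1]))
# -> e.g. [4, 1, 9, 5, 6, 2]; order depends on the hash function
```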
So which is better?
• Based on our simple analysis:
  – Same memory requirement for 2 passes
  – Same I/O cost
• Digging deeper …
• Sorting pros:
  – Great if the input is already sorted (or almost sorted)
  – Great if the output needs to be sorted anyway
  – Not sensitive to data skew or "bad" hash functions
• Hashing pros:
  – Highly parallelizable (will discuss later in the semester)
  – Can exploit extra memory to reduce # of I/Os (stay tuned…)

Nested Loops Joins
• R, with M pages, joins S, with N pages
• Nested loops variants:
  – Simple (tuple-at-a-time) nested loops
    • Insanely inefficient: M + p_R·M·N, where p_R is the number of tuples per page of R
  – Page nested loops – only 3 pages of memory
    • M + M·N
  – Blocked nested loops – B pages of memory
    • M + ⌈M/(B–2)⌉·N
    • If M fits in memory (B–2 pages), cost is only M + N
  – Index nested loops
    • M + p_R·M·(cost of one index lookup)
    • Only good if M is very small

Sort-Merge Join
• Simple case:
  – Sort both tables on the join column, then merge
  – Cost: external sort cost + merge cost
    • 2M·(1 + ⌈log_{B–1}(M/B)⌉) + 2N·(1 + ⌈log_{B–1}(N/B)⌉) + M + N
• Optimized case:
  – If we have enough memory, do the final merge and the join in the same pass. This avoids the final write pass of the sorts and the read pass of the merge.
  – Can we merge on the 2nd pass? Only if the # of runs from the 1st pass is < B
  – # of runs for R is ⌈M/B⌉; # of runs for S is ⌈N/B⌉
    • Total # of runs ≈ (M+N)/B
  – So we can merge on the 2nd pass if (M+N)/B < B, i.e., M+N < B²
  – Cost: 3(M+N)

Hash Join
[Figure: partitioning phase – both R and S stream through h into B–1 partition pairs on disk; matching phase – for each i, build a hash table on Ri (≤ B–2 pages) with h2, then stream Si through an input buffer, probing with h2 and writing join results to an output buffer.]
• (A runnable sketch of this algorithm follows the summary at the end.)

Cost of Hash Join
• Partitioning phase: read + write both relations → 2(|R|+|S|) I/Os
• Matching phase: read both relations → |R|+|S| I/Os (writes of the join result are not counted)
• Total cost of 2-pass hash join = 3(|R|+|S|)
• Q: what is the cost of a 2-pass sort-merge join? (Also 3(|R|+|S|), using the optimization above.)
• Q: how much memory is needed for a 2-pass sort-merge join? (Roughly B > √(|R|+|S|).)
• Q: how much memory is needed for a 2-pass hash join? (Roughly B > √(size of the smaller relation), since only its partitions must fit in memory.)

An important optimization to hashing
• Have B memory buffers; want to hash a relation of N pages
• If N ≤ B: 1 pass, cost N. If B < N ≤ B²: 2 passes, cost 3N.
[Figure: cost vs. N – a step function that jumps from N to 3N at N = B]
• If B < N < B², we will have unused memory …

Hybrid Hashing
• Idea: keep one of the hash buckets in memory!
[Figure: the input streams through h3 – bucket 1 stays in a k-buffer main-memory hash table; buckets 2 … B–k spill through output buffers to disk partitions]
• Q: how do we choose the value of k?

Cost reduction due to hybrid hashing
• Now:
[Figure: cost vs. N – the step at N = B is smoothed out; cost climbs gradually from N toward 3N as N grows from B to B², since the in-memory bucket's tuples are processed in a single pass]

Summary: Hashing vs. Sorting
• Sorting pros:
  – Good if the input is already sorted, or the output must be sorted
  – Not sensitive to data skew or bad hash functions
• Hashing pros:
  – Often cheaper, due to hybrid hashing
  – For join: # of passes depends on the size of the smaller relation
  – Highly parallelizable

Summary
• There are several alternative evaluation algorithms for each operator.
• A query is evaluated by converting it to a tree of operators and evaluating the operators in the tree.
• You must understand query optimization in order to fully understand the performance impact of a given database design (relations, indexes) on a workload (set of queries).
• Two parts to optimizing a query:
  – Consider a set of alternative plans.
    • Must prune the search space; typically, left-deep plans only.
  – Estimate the cost of each plan that is considered.
    • Must estimate the size of the result and the cost of each plan node.
• Key issues: statistics, indexes, operator implementations.
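As promised above, here is a minimal Python sketch of the two-pass hash join. The relations, the join key (the first field of each tuple), and the partition count are illustrative assumptions; in-memory lists again stand in for disk partitions.

```python
def hash_join(r, s, n_partitions=4):
    """Two-pass hash join of r and s on the first field of each tuple.
    n_partitions plays the role of the B-1 partition pairs on disk."""
    # Partitioning phase: hash both relations with h so that matching
    # keys land in the same partition pair (~2(|R|+|S|) I/Os on disk).
    r_parts = [[] for _ in range(n_partitions)]
    s_parts = [[] for _ in range(n_partitions)]
    for t in r:
        r_parts[hash(t[0]) % n_partitions].append(t)
    for t in s:
        s_parts[hash(t[0]) % n_partitions].append(t)
    # Matching phase: for each partition i, build a hash table on R_i
    # with a second hash function (the dict), then probe it with S_i.
    results = []
    for r_i, s_i in zip(r_parts, s_parts):
        table = {}
        for t in r_i:
            table.setdefault(t[0], []).append(t)
        for t in s_i:
            for match in table.get(t[0], []):
                results.append(match + t)
    return results

emp = [(1, "Alice"), (2, "Bob")]              # (eid, name)
sal = [(1, 95000), (2, 105000), (3, 80000)]   # (eid, sal)
print(hash_join(emp, sal))
# -> [(1, 'Alice', 1, 95000), (2, 'Bob', 2, 105000)]
```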