Query Processing

General Overview

- Relational model - SQL
  - Formal & commercial query languages
- Functional Dependencies
- Normalization
- Physical Design
- Indexing
- Query Processing and Optimization

Review

Data retrieval at the physical level:
- Indices: data structures to help with some query evaluation:
  - SELECTION queries (ssn = 123)
  - RANGE queries (100 <= ssn <= 200)
- Index choices: primary vs. secondary, dense vs. sparse, ISAM vs. B+-tree vs. extendible hashing vs. linear hashing
- But what about join queries, or other queries not directly supported by the indices? How do we evaluate these queries?
- Sometimes indexes are not useful, even for SELECTION queries. When? What decides when to use them?

A: Query Processing (one of the most complex components of a database system)

QP & O

SQL query -> Query Processor -> Data (result of the query)

QP & O

SQL query -> [Query Processor: Parser -> algebraic expression -> Query Optimizer -> execution plan -> Evaluator] -> Data (result of the query)

QP & O

Algebraic representation -> [Query Optimizer: Query Rewriter -> algebraic representation -> Plan Generator (using data stats)] -> Query execution plan

Query Processing and Optimization

Parser / translator (1st step)
- Input: SQL query
- Output: algebraic representation of the query (relational algebra expression)
- E.g., SELECT balance FROM account WHERE balance < 2500 becomes $\pi_{balance}(\sigma_{balance < 2500}(account))$, or as a relational algebra tree (top to bottom): $\pi_{balance}$ -> $\sigma_{balance < 2500}$ -> account

QP & O

Plan Evaluator (last step)
- Input: query execution plan
- Output: data (query results)

Query execution plan: algorithms of operators that read from disk:
- Sequential scan
- Index scan
- Merge-sort join
- Nested loop join
- ...
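To make the parser and evaluator steps above concrete, here is a minimal sketch of how the relational algebra tree for the balance query could be represented and evaluated. Everything in it is an assumption for illustration: the class names `Scan`, `Select` and `Project` are hypothetical, relations are plain Python lists of dicts, and the three sample rows are made up. A real evaluator reads blocks from disk rather than from memory.

```python
from dataclasses import dataclass
from typing import Callable, Iterator

Row = dict  # one tuple of a relation, keyed by attribute name


@dataclass
class Scan:
    """Leaf operator: sequential scan over an in-memory 'relation'."""
    rows: list

    def __iter__(self) -> Iterator[Row]:
        return iter(self.rows)


@dataclass
class Select:
    """sigma_pred(child): pass through only the rows satisfying pred."""
    pred: Callable[[Row], bool]
    child: object

    def __iter__(self) -> Iterator[Row]:
        return (row for row in self.child if self.pred(row))


@dataclass
class Project:
    """pi_attrs(child): keep the listed attributes (no duplicate removal)."""
    attrs: list
    child: object

    def __iter__(self) -> Iterator[Row]:
        return ({a: row[a] for a in self.attrs} for row in self.child)


# Hypothetical sample data, not taken from the slides.
account = [
    {"acct_no": "A-101", "balance": 500},
    {"acct_no": "A-215", "balance": 700},
    {"acct_no": "A-305", "balance": 3000},
]

# pi_balance(sigma_{balance < 2500}(account)), built bottom-up.
plan = Project(["balance"], Select(lambda r: r["balance"] < 2500, Scan(account)))
result = list(plan)
print(result)  # [{'balance': 500}, {'balance': 700}]
```

Each operator pulls rows from its child on demand, which is roughly what the "on-the-fly" annotation in a plan refers to: no intermediate result is materialized.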
SELECT S.sname
FROM Reserves R, Sailors S
WHERE R.sid = S.sid AND R.bid = 100 AND S.rating > 5

Relational algebra: $\pi_{sname}(\sigma_{bid = 100 \wedge rating > 5}(Reserves \bowtie_{sid = sid} Sailors))$

RA tree (top to bottom): $\pi_{sname}$ -> $\sigma_{bid = 100 \wedge rating > 5}$ -> $\bowtie_{sid = sid}$ -> Reserves, Sailors

Plan (top to bottom):
- $\pi_{sname}$ (On-the-fly)
- $\sigma_{bid = 100 \wedge rating > 5}$ (On-the-fly)
- $\bowtie_{sid = sid}$ (Simple Nested Loops)
- Inputs: Reserves, Sailors

QP & O

Query Rewriting
- Input: algebraic representation of the query
- Output: algebraic representation of the query
- Idea: apply heuristics to generate an equivalent expression that is likely to lead to a better plan
- E.g.: $\sigma_{amount > 2500}(borrower \bowtie loan)$ is rewritten as $borrower \bowtie \sigma_{amount > 2500}(loan)$
- Why is the 2nd better than the 1st? A: The selection shrinks loan before the join, so the join processes fewer tuples.

QP & O

Plan Generator
- Input: algebraic representation of the query
- Output: query execution plan
- Idea:
  1) Generate alternative plans for evaluating a query (e.g., evaluate $\sigma_{amount > 2500}$ with a sequential scan or with an index scan)
  2) Estimate the cost of each plan
  3) Choose the plan with the lowest cost

Alternative Plans

Plan 1 (top to bottom):
- $\pi_{sname}$ (On-the-fly)
- $\bowtie_{sid = sid}$ (Sort-Merge Join)
- Left input: $\sigma_{bid = 100}(Reserves)$ (Scan; write to temp T1)
- Right input: $\sigma_{rating > 5}(Sailors)$ (Scan; write to temp T2)

Plan 2 (top to bottom):
- $\pi_{sname}$ (On-the-fly)
- $\sigma_{rating > 5}$ (On-the-fly)
- $\bowtie_{sid = sid}$ (Index Nested Loops, with pipelining)
- Left input: $\sigma_{bid = 100}(Reserves)$ (Use hash index; do not write result to temp)
- Right input: Sailors

QP & O

Goal: generate the plan with minimum cost (i.e., as fast as possible)

Cost factors:
1. CPU time (trivial compared to disk time)
2. Disk access time: the main cost in most DBs
3. Network latency: the main concern in distributed DBs

Our metric: count disk accesses

Cost Model

How do we predict the cost of a plan? Ans: cost model
- For each plan operator and each algorithm we have a cost formula
- Inputs to the formulas depend on the relations and attributes
- The database maintains statistics about relations for this (metadata)

Statistics and Catalogs

Need information about the relations and indexes involved. Catalogs typically contain at least:
- # tuples (NTuples) and # pages (NPages) for each relation.
- # distinct key values (NKeys) and NPages for each index.
- Index height, low/high key values (Low/High) for each tree index.

Catalogs are updated periodically.
- Updating whenever data changes is too expensive; there is lots of approximation anyway, so slight inconsistency is ok.

More detailed information (e.g., histograms of the values in some field) is sometimes stored.

Metadata

Given a relation r, the DBMS likely maintains the following metadata:
1. Size (# of tuples): $n_r$
2. Size (# of blocks): $b_r$
3. Block size (# of tuples per block): $f_r$ (typically $b_r = n_r / f_r$)
4. Tuple size (in bytes): $s_r$
5. Attribute variance (for each attribute att of r, the # of different values): $V(att, r)$
6. Selection cardinality (for each attribute att in r, the expected size of a selection $\sigma_{att = K}(r)$): $SC(att, r)$

Example

account:

bname    acct_no   balance
Dntn     A-101     500
Mianus   A-215     700
Perry    A-102     500
R.H.     A-305     900
Dntn     A-200     700
Perry    A-301     500

- $n_{account} = 6$
- $s_{account} = 33$ bytes
- $f_{account} = 4K / 33$
- $V(balance, account) = 3$
- $V(acct\_no, account) = 6$
- $SC(balance, account) = 2$ (estimated as $n_r / V(att, r)$)

Some typical plans and their costs

Query: $\sigma_{att = K}(r)$

A1 (linear search). Scan each file block and test all records to see whether they satisfy the selection condition.
- Cost estimate (number of disk blocks scanned): $E_{A1} = b_r$, where $b_r$ denotes the number of blocks containing records from relation r
- If the selection is on a key attribute, cost = $b_r / 2$: stop on finding the record (on average, in the middle of the file)
- Linear search can be applied regardless of the selection condition, the ordering of records in the file, or the availability of indices

Selection Operation (Cont.)

Query: $\sigma_{att = K}(r)$

A2 (binary search). Applicable if the selection is an equality comparison on the attribute on which the file is ordered.
- Requires that the blocks of the relation are stored contiguously
- Cost estimate: $\log_2(b_r)$, the cost of locating the first tuple by a binary search on the blocks, plus the number of blocks containing records that satisfy the selection condition:
  $E_{A2} = \log_2(b_r) + SC(att, r) / f_r - 1$

What is the cost if att is a key?
$E_{A2} = \log_2(b_r)$

Example

Query: $\sigma_{bname = \text{"Kenmore"}}(account)$
- $V(bname, account) = 50$
- $n_{account} = 10K$
- $f_{account} = 20$ tuples/block
- Primary index on bname
- Key: acct_no

Cost estimates:
- A1: $E_{A1} = n_{account} / f_{account} = 500$ I/Os
- A2: $E_{A2} = \log_2(b_r) + SC(att, r) / f_r - 1 \approx 9 + 9 = 18$ I/Os (with $b_r = 500$, so $\log_2(500) \approx 9$, and $SC(bname, account) = 10K / 50 = 200$, so $200 / 20 - 1 = 9$)

More Plans for selection

What if there is an index on att? We need metadata on the size of the index (i). The DBMS keeps track of:
1. Index height: $HT_i$
2. Index "fan out": $f_i$, the average # of children per node (not the same as the order)
3. Index leaf nodes: $LB_i$

Note: $HT_i \approx \log_{f_i}(LB_i) + 1$. Example: $LB_i = 64$, $f_i = 4$ gives $HT_i \approx \log_4(64) + 1 = 4$.

More Plans for selection

Query: $\sigma_{att = K}(r)$

A3: Index scan, primary index
- What: follow the primary index, searching for key K
- Prereq: primary index on att, i
- Cost: $E_{A3} = HT_i + 1$ if att is a candidate key; $E_{A3} = HT_i + SC(att, r) / f_r$ if not

A4: Index scan, secondary index
- What: follow the corresponding index, searching for key K
- Prereq: secondary index on att, i
- Cost:
  - If att is not a key: $E_{A4} = HT_i + 1 + SC(att, r)$, where $HT_i$ counts index block reads, 1 is the bucket read, and $SC(att, r)$ counts file block reads (in the worst case, each tuple is on a different block)
  - Else, if att is a key: $E_{A4} = HT_i + 1$

[Figure: index of height $HT_i$; the entry for key k leads to the records with att = k in the file]

Cardinalities

Cardinality: the size (number of tuples) of the query result

Why do we care? Ans: the cost of every plan depends on $n_r$, e.g.:
- Linear scan: $b_r = n_r / f_r$
- Primary index: $HT_i + 1 \approx \log_{f_i}(LB_i) + 2 \le \log_{f_i}(n_r / f_r) + 2$

But what if r is the result of another query? We must know the size of query results as well as their cost.

Size of $\sigma_{att = K}(r)$? $SC(att, r)$

Selections Involving Comparisons

Query: $\sigma_{Att \ge K}(r)$

A5 (primary index, comparison).
(Relation is sorted on Att)
- For $\sigma_{Att \ge K}(r)$: use the index to find the first tuple with Att $\ge$ K and scan the relation sequentially from there
- For $\sigma_{Att \le K}(r)$: just scan the relation sequentially until the first tuple with Att > K; do not use the index
- Cost: $E_{A5} = HT_i + c / f_r$ (where c is the cardinality of the result)

[Figure: index of height $HT_i$ locating the first qualifying tuple k in the sorted file]

Query: $\sigma_{Att \ge K}(r)$

Cardinality: more metadata on r is needed:
- $\min(att, r)$: minimum value of att in r
- $\max(att, r)$: maximum value of att in r

Then the size of $\sigma_{Att \ge K}(r)$ is estimated as:

$n_r \cdot \frac{\max(att, r) - K}{\max(att, r) - \min(att, r)}$ (or $n_r / 2$ if min, max are unknown)

Intuition: assume a uniform distribution of values between $\min(att, r)$ and $\max(att, r)$, with K lying between them.

Plan generation: Range Queries

A6 (secondary index, comparison). Query: $\sigma_{Att \ge K}(r)$, where att is a candidate key

[Figure: secondary index of height $HT_i$; leaf entries k, k+1, ..., k+m point to records scattered across the file]

Cost: $E_{A6} = HT_i - 1$ + # of leaf nodes to read + # of file blocks to read
$= HT_i - 1 + LB_i \cdot (c / n_r) + c$

Plan generation: Range Queries

A6 (secondary index, range query). If att is NOT a candidate key:

[Figure: as above, but each leaf entry k, k+1, ..., k+m leads through a bucket to several records]

Cost: $E_{A6} = HT_i - 1$ + # of leaf nodes to read + # of file blocks to read + # of buckets to read
$= HT_i - 1 + LB_i \cdot (c / n_r) + c + x$ (where x is the number of buckets read)
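The selection cost estimates A1 through A6 and the range cardinality estimate above can be collected into a small calculator. This is only a sketch under stated assumptions: the `RelStats` class and all function names are hypothetical, block counts are rounded up with ceilings (the formulas above leave rounding implicit), and the index height, leaf count and min/max values in the range example are invented for illustration. The account statistics are the ones from the "Kenmore" example.

```python
import math
from dataclasses import dataclass


@dataclass
class RelStats:
    """Catalog metadata for a relation r (hypothetical container)."""
    n: int  # n_r: number of tuples
    f: int  # f_r: tuples per block

    @property
    def b(self) -> int:
        # b_r = n_r / f_r, rounded up (assumes densely packed blocks)
        return math.ceil(self.n / self.f)


def sc(rel: RelStats, v: int) -> float:
    """SC(att, r) = n_r / V(att, r), assuming uniformly distributed values."""
    return rel.n / v


def cost_linear(rel, is_key=False):
    """A1: scan every block; about half of them on average for a key."""
    return rel.b / 2 if is_key else rel.b


def cost_binary(rel, v):
    """A2: binary search for the first match, then read the matching blocks."""
    return math.ceil(math.log2(rel.b)) + math.ceil(sc(rel, v) / rel.f) - 1


def cost_primary_index(rel, ht, v, is_key=False):
    """A3: descend the index, then read the blocks holding the matches."""
    return ht + 1 if is_key else ht + math.ceil(sc(rel, v) / rel.f)


def cost_secondary_index(rel, ht, v, is_key=False):
    """A4: index descent + bucket read + one block per match (worst case)."""
    return ht + 1 if is_key else ht + 1 + math.ceil(sc(rel, v))


def range_cardinality(rel, k, lo=None, hi=None):
    """Estimated size of sigma_{att >= K}(r); n_r / 2 if min/max are unknown."""
    if lo is None or hi is None or hi == lo:
        return rel.n / 2
    frac = (hi - k) / (hi - lo)
    return rel.n * min(1.0, max(0.0, frac))  # clamp K outside [min, max]


def cost_primary_range(rel, ht, c):
    """A5: find the first qualifying tuple, then scan sequentially."""
    return ht + math.ceil(c / rel.f)


def cost_secondary_range(rel, ht, lb, c, buckets=0):
    """A6: HT_i - 1 + leaf nodes read + file blocks read (+ x buckets)."""
    return ht - 1 + math.ceil(lb * c / rel.n) + math.ceil(c) + buckets


# Statistics from the "Kenmore" example: n = 10K, f = 20, V(bname) = 50.
account = RelStats(n=10_000, f=20)
print(cost_linear(account))        # 500
print(cost_binary(account, v=50))  # 18

# Hypothetical range query: att >= 75 with min = 0, max = 100, HT_i = 3.
c = range_cardinality(account, k=75, lo=0, hi=100)
print(c, cost_primary_range(account, ht=3, c=c))
```

A plan generator would evaluate functions like these for every applicable algorithm and keep the cheapest. Note how the secondary-index estimates grow with the result size c, which is why an index is not always the better choice for a selection.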