Outline – Query Optimization CSE 562 Database Systems • Overview • Relational algebra level – Algebraic Transformations • Detailed query plan level Query Processing: Algebraic Optimization – Estimate Costs – Estimating size of results – Estimating # of IOs – Generate and compare plans Some slides are based or modified from originals by Database Systems: The Complete Book, Pearson Prentice Hall 2nd Edition ©2008 Garcia-Molina, Ullman, and Widom UB CSE 562 2 UB CSE 562 Algebraic Rewritings: Commutative and Associative Laws Relational Algebra Optimization Commutative • Transformation rules (preserve equivalence) • What are good transformations? Cartesian Product × R Associative S S × × × R R S Natural Join R S S R T × × R T S T R S T R S • Question 1: Do the above hold for both sets and bags? • Question 2: Do commutative and associative laws hold for arbitrary Theta Joins? UB CSE 562 3 UB CSE 562 4 1 Algebraic Rewritings: Commutative and Associative Laws Commutative Union ∪ R Associative S ∪ ∪ ∪ S R R S S T R R ∩ ∩ ∩ R S T S ∩ ∩ ∩ ∪ ∪ R S Intersection Algebraic Rewritings for Selection: Decomposition of Logical Connectives T R T ∪ S Does it apply to bags? • Question 1: Do the above hold for both sets and bags? • Question 2: Is difference commutative and associative? 5 UB CSE 562 Algebraic Rewritings for Selection: Decomposition of Negation Question 6 UB CSE 562 Pushing Selection Through Binary Operators: Union and Difference Complete ∪ Union ∪ – – Difference – • Exercise: Do the rules for intersection UB CSE 562 7 UB CSE 562 8 2 Pushing Selection Through Cartesian Product and Join Rules: π + σ combined Let X = subset of R attributes Z = attributes in predicate P (subset of R attributes) The right direction requires that cond refers to S attributes only The right direction refers to requires that cond S attributes only The right direction requires that all th e attribut by cond ap es used pear in bo th R and S • Exercise: Do the rule for theta join 9 UB CSE 562 Pushing Simple Projections Through Binary Operators: Union Pushing Simple Projections Through Binary Operators: Join and Product • A projection is simple if it only consists of an attribute list • B is the list of R attributes that appear in A • Similar for C ∪ ∪ Union • Question: What is B and C ? • Question 1: Does the above hold for both bags and sets? • Question 2: Can projection be pushed below intersection and difference? • Answer for both bags and sets UB CSE 562 10 UB CSE 562 • Exercise: Write the rewriting rule that pushes projection below theta join 11 UB CSE 562 12 3 Rules: π + σ + combined Projection Decomposition • Let X = set of attributes • Y = set of attributes • XY = X U Y • Z’ = Z U {attributes used in cond} UB CSE 562 13 14 UB CSE 562 Some Rewriting Rules Related to Aggregation: SUM Projection Decomposition • σcond[SUMGroupbyList;GroupedAttribute→ResultAttribute(R)] ⇔ SUMGroupbyList;GroupedAttribute→ResultAttribute[σcond(R)] if cond involves only the GroupbyList • SUMGL;GA→RA(R ∪ S) ⇔ PLUSRA1,RA2:RA[(SUMGL;GA→RA1R) (SUMGL;GA→RA2 S)] • SUMGL2;RA1→RA2[SUMGL1;GA→RA1(R)] ⇔ SUMGL2:GA→RA2(R) • Question: does the above hold for both bags and sets? UB CSE 562 15 UB CSE 562 16 4 Derived Rules: σ + combined Derivation for first one More Rules can be Derived: σp∧q (R S) = σp[σq (R S)] = σp[R σq (S)] = [σp (R)] [σq (S)] σp∧q (R S) = [σp (R)] [σq (S)] σp∧q∧m (R S) = σm[σp (R) σq (S)] σpvq (R S) = [σp (R) S] U [R σq (S)] • p only at R • q only at S • m at both R and S 17 UB CSE 562 σp1∧p2 (R) → σp1[σp2 (R)] • σp (R • • UB CSE 562 18 Conventional Wisdom: Do Projects Early Which are “good” transformations? • UB CSE 562 [σp (R)] S R S → S R πx[σp (R)] → πx{σp[πxz (R)]} S) → 19 UB CSE 562 20 5 But… More Transformations in Textbook What if we have A, B indexes? B = “cat” • Eliminate common sub-expressions • Other operations: duplicate elimination A=3 Intersect pointers to get pointers to matching tuples 21 UB CSE 562 UB CSE 562 22 Bottom line • No transformation is always good at the logical query plan level • Usually good: – early selections – elimination of Cartesian products – elimination of redundant sub-expressions • Many transformations lead to “promising” plans – Commuting/rearranging joins – In practice too “combinatorially explosive” to be handled as rewriting of logical query plan UB CSE 562 23 6