16.2.Algebraic Laws for Improving Query Plans 16.2 Algebraic Laws for Improving Query Plans • • • • • • • • 16.2.1 Commutative and Associative Laws 16.2.2 Laws Involving Selection 16.2.3 Pushing Selections 16.2.4 Laws Involving Projection 16.2.5 Laws About Joins and Products 16.2.6 Laws Involving Duplicate Elimination 16.2.7 Laws Involving Grouping and Aggregation 16.2.8 Exercises for Section 16.2 16.2.1 Commutative and Associative Laws Commutativity for Sets and Bags (Ch5): R x S = S x R (Proof) R S = S R (ch5 e) R U S = S U R(ch5) R ∩ S = S ∩ R(ch5) Associativity Sets and Bags: : (R x S) x T = R x (S x T) (RS) T = R (S T) (R U S) U T = R U (S U T)(ch5) (R ∩ S) ∩ T = R ∩ (S ∩ T)(ch5) 16.2.2 Laws Involving Selection • Selections reduce the size of relations. • To make efficient query, the selection must be moved down the tree without the changing what the expression does. • When the condition for the selection is complex, it helps to break the condition into its constituent parts. 16.2.2 Laws Involving Selection • first two laws for σ are the splitting laws, • σc1 AND c2 (R) = σc1(σc2(R)) • σc1 OR c2 (R) = (σc1(R)) s (σc2(R)) The second law for OR works only if the relation R is the set .If R is a bag, then the set union Us will eliminate the duplicates incorrectly. • σc1(σc2(R))= σc2(σc1(R)) 16.2.2 Laws Involving Selection Laws of selection with binary operators like product, union, intersection, difference, join. (3 laws) 1. For a union, the selection must be pushed to both arguments. 1. 2. σc (R U S) = σc (R) U σc (S) For a difference, the selection must be pushed to first argument and optionally to second. 1. 2. 3. σc (R - S) = σc (R) – S σc (R - S) = σc (R) - σc (S) it is only required that the selection must be pushed to one or both argument. – – – – σc(R x S) = σc (R) x S σc (R S) = σ (R) S σc (RD S) = σ (R) D S σc ( R ∩ S) = σc (R) ∩ S 16.2.2 Laws Involving Selection Laws of selection with binary operators like product, union, intersection, 3. it is only required that the selection must be pushed to one or both argument. – σc(R x S) = R x σc (S) – σc (R S) = σc (R) σc(S) 16.2.3 Pushing Selections • Pushing Selection down the expression tree( i.e replacing the left side of one of the rules by the right side )is one of the best method to optimize query. • An example for Pushing Selection is illustrated as follows 16.2.3 Pushing Selections • Suppose we have relations StarsIn(title ,year , starName) Movie(title ,year, length,inColor, studioName) • We Define a view Movies1996 as CREATE VIEW Movie1996 AS SELECT * FROM MOVIE WHERE year = 1996; Selection: C(R) SQL: Select from where * R C C involves computable formulas 10 Projection: L(R) Select from L R 11 16.2.3 Pushing Selections • The query to find out which stars worked in which studios in 1996 SELECT starName ,studioName FROM Movie1996 NATURAL JOIN StarsIn • The view is Movie1996 is defined by σ year = 1996 (Movie) Select MS.studioname, MS.starname From(Select M.title, M.year, M.length, M.inColor, 16.2.3a Pushing Selections π starName ,studioName σ year = 1996 StarsIn Movie π starName ,studioName (σ year = 1996(Movie) StarsIn )= π starName ,studioName (σ year = 1996(Movie StarsIn ) 16.2.3b Pushing Selections π starName ,studioName σ year = 1996 StarsIn Movie π starName ,studioName (σ year = 1996(Movie) StarsIn )= π starName ,studioName (σ year = 1996(Movie StarsIn ) 16.2.3c Pushing Selections π starName ,studioName σ year = 1996 σ year = 1996 Movie StarsIn π starName ,studioName (σ year π starName ,studioName (σ year StarsIn )= = 1996(Movie) σ year = 1996(StarsIn )) = 1996(Movie 16.2.4 Laws Involving Projection • Projection, like selection can be pushed down through many other operators • Pushing Projection usually involves introducing a new projection somewhere below an existing projection. • Projection differs from selection in the aspect that projection reduces the length of the tuples whereas selection reduces the number of the tuples 16.2.4. Laws involving Projection • Consider term π E x – E : attribute, or expression involving attributes and constants. – All attributes in E are input attributes of projection and x is output attribute • Simple projection: if a projection consists of only attributes. – Example: π a,b,c (R) is simple. a,b,c are input and output attributes. • Projection can be introduced anywhere in expression tree as long as it only eliminates attributes that are never used. 16.2.4. Laws involving Projection (cont….) • πL(R S) = πL(πM(R) πN(S)) ; M and N are all attributes of R and S that are either join (in schema of both R and S) or input attributes of L • πL(R c S) = πL(πM(R) c πN(S)) ; M and N are all attributes of R and S that are either join(mentioned in condition of C ) or input attributes of L • πL(R x S) = πL(πM(R) x πN(S)) ; M and N are all attributes of R and S that are input attributes of L Projections cannot be pushed below set unions or either of set or bag versions of intersection or difference at all. 16.2.4 Laws Involving Projection SELECT starName FROM StarsIn WHERE year = 1996 π starName σ movieYear = 1996 StarsIn Fig : Logical query plan for the above query We can introduce a projection in the above Figure 16.2.4 Laws Involving Projection π starName σ movieYear = 1996 π starName, movieYear StarsIn Convert the tree into relational algebra, then simplify as much as you can 16.2.5 Laws About Joins and Products • RCS= C(R S) • RS= L (C(R S)) Where C is the condition that equates each pair of atrribute from R and S with the same name, and L is the list that includes one attribute from each equted attributed and all other attributes of R and S. 16.2.6 Laws Involving Duplicate Elimination • The operator δ , which eliminates duplicates from a bag can be pushed through only some of the operators • Moving δ down the tree reduces the size of intermediate relation and may therefore be beneficial • In some cases, we can move δ to a position where it can be eliminated because it is applied to a relation that does not have any duplicates 16.2.6 Laws Involving Duplicate Elimination • δ( R ) = R if R has no duplicates Important cases of such a relation R include 1. A stored relation with a declared primary key 2. A relation that is the result of a γ operation ,since grouping creates a relation with no duplicates • δ cannot be moved across the operators like U , - , π. 16.2.6 Laws involving duplicate elimination Laws that “push” δ (delta) through other operator • δ(R x S) = δ(R) x δ(S) • δ(R S) = δ(R) δ(S) • δ(R c S) = δ(R) c δ(S) • δ( c(R)) = c(δ(R)) δ eliminates duplicates from a bag, but cannot be pushed through all the operators 16.2.7 Laws Involving Grouping and Aggregation • While using grouping and aggregation ,the applicability of many transformation depends on the details of the aggregation used. • Due to the above ,we cannot state laws in generality. • One exception is the law below that γ absorbs δ δ(γL(R)) = γL ( R ) 16.2.7 Laws Involving Grouping and Aggregation • We may project useless attributes prior to applying γ operation • γL ( R ) = γL(πM (R ) where M is the list containing at least all those attributes of R that are mentioned in L. Laws involving grouping and aggregation (cont…) • Some aggregations like MIN and MAX are not affected by presence or absence of duplicates • Others like SUM,COUNT,AVG produce different values if duplicates are eliminated prior to aggregation. 16.2.7 Laws Involving Grouping and Aggregation • Suppose we have the relation • MovieStar(name ,addr ,gender ,birthdate) • StarsIn(movieTitle ,movieYear ,starName) • Consider the query below • Select movieYear ,MAX(birthDate) • FROM MovieStar ,StarsIn • WHERE name = starName • GROUP BY movieYear 16.2.7 Laws Involving Grouping and Aggregation • The FROM list is expressed by a product and the WHERE clause by a selection above it. • The grouping and aggregation are expressed by the γ. • Combine the selection and product into an equijoin • Generate a δ below the γ ,since the γ is duplicate-impervious • Generate a π between the γ and the introduced δ to project onto movieYear and birthDate ,the only attributes relevant to the γ 16.2.7 Laws Involving Grouping and Aggregation γmovieYear ,MAX(birthDate) σname = starName MovieStar StarsIn 1. Use 16.2.5. (and following 2 reasons) we can rewrite the tree 2. There is no duplication in output (because γ), we can add 3. By projection law We can add . 16.2.7 Laws Involving Grouping and Aggregation γmovieYear ,MAX(birthDate) movieYear ,birthDate δ name = starName MovieStar Figure : Second query plan StarsIn 16.2.7 Laws Involving Grouping and Aggregation γ movieYear ,MAX(birthDate) π movieYear ,birthDate name = starName δ πbirthDate,name MovieStar δ πbirthDate,name StarsIn Figure : Third query plan can be push down 16.2.7a Additional Example • From DB1 16.2.7a Laws Involving Grouping and Aggregation • SELECT • FROM • GROUP PNUM, SUM(QTY) SHIPMENTs, Parts BY PNAME; 16.2.7b Laws Involving Grouping and Aggregation γpname ,SUM(qty) sum σpnum = pnum Shipments (Sh) Parts(P) Figure : Initial Logical query plan for the query γpname ,SUM(qty) sum(σsh.pnum=p.pnum (ShipmentsParts)) 16.2.7c Laws Involving Grouping and Aggregation γpname ,SUM(qty)sum pname ,QTY δ Shipments Parts γpname ,SUM(qty)sum ( pname.qty (Shipments Parts)) 16.2.7d Laws Involving Grouping and Aggregation γpname ,SUM(qty)sum π pname ,qty δ πqty,pnum Shipments δ pnum, pname Parts γpname ,Sum(qty) sum ( pname.qty (( pname.qty (Shipments)) ( pname.qty (Parts)))) 16.2.8 Exercises for Section 16.2