Relational Calculus
Chapter 4 – Part II

Formal Relational Query Languages
Two mathematical query languages form the basis for "real" languages (e.g. SQL) and for implementation:
• Relational Algebra: how to compute it; operational. Represents execution plans (inside).
• Relational Calculus: what you want; declarative. Influences SQL design (outside).

Relational Calculus
Calculus has:
• variables and constants
• comparison ops (<, >, =, ≤, ≥, ≠)
• logical connectives (¬, ∧, ∨, ⇒)
• quantifiers (∃, ∀)
It comes in two flavours:
• Tuple relational calculus (TRC): variables range over tuples. (SQL)
• Domain relational calculus (DRC): variables range over domain elements. (QBE)
Formulas in both TRC and DRC are simple subsets of first-order logic.

Tuple Relational Calculus (TRC)
Tuple variable: a variable that takes on tuples of a relation schema as values.
Query form: { T | p(T) }
• T = tuple variable
• p(T) = formula that describes T
• Result = set of all tuples t for which p(T) evaluates to true with T = t
The formula p(T) is specified in first-order logic.

Find all sailors with a rating above 7
{ S | S ∈ Sailors ∧ S.rating > 7 }
The variable S is bound to a tuple in relation Sailors.

TRC Formulas
A formula is an atomic formula:
• R ∈ Rel
• R.a op S.b
• R.a op constant, or constant op R.a
or is recursively defined as:
• ¬p, p ∧ q, p ∨ q, p ⇒ q, where p and q are formulas
• ∃R(p(R)), where R is a tuple variable
• ∀R(p(R)), where R is a tuple variable
Terms:
• bound: the quantifiers ∃R and ∀R are said to bind the variable R.
• free: a variable is free in a formula if no quantifier in the formula binds it.

Semantics of TRC Queries
A formula F evaluates to true under an assignment of tuples to its free variables if one of the following holds:
• F is an atomic formula R ∈ Rel, and R is assigned a tuple in the instance of relation Rel, e.g. { S | S ∈ Sailors }.
• F is a comparison R.a op S.b, R.a op constant, or constant op R.a, and the tuples assigned to R and S have field values R.a and S.b that make the comparison true.
• F is of the form ¬p, and p is not true; the form p ∧ q, and both p and q are true; the form p ∨ q, and at least one of them is true; or the form p ⇒ q, and q is true whenever p is true.
• F is of the form ∃R(p(R)), and there is some assignment of tuples to the free variables in p(R), including the variable R, that makes the formula p(R) true.
• F is of the form ∀R(p(R)), and there is some assignment of tuples to the free variables in p(R) that makes the formula p(R) true no matter what tuple is assigned to R.

TRC Example

Reserves
sid  bid  day
22   101  10/10/96
58   103  11/12/96

Sailors
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0

Find the names and ages of sailors with rating > 7
{ P | ∃S ∈ Sailors (S.rating > 7 ∧ P.name = S.sname ∧ P.age = S.age) }
• P is a tuple variable with exactly two fields, name and age: these are the only fields of P mentioned in the query, and P does not range over any relation in the query.

On the Sailors instance above, the result is:
name    age
lubber  55.5
rusty   35.0

Find the names of the sailors who reserved boat 103
{ P | ∃S ∈ Sailors ∃R ∈ Reserves (R.sid = S.sid ∧ R.bid = 103 ∧ P.sname = S.sname) }
Retrieve all sailor tuples for which there exists a tuple in Reserves having the same value in the sid field and with bid = 103.
Question: What does the answer look like (which columns, which rows)? How does it differ from the answer to the following query?
{ S | S ∈ Sailors ∧ ∃R ∈ Reserves (R.sid = S.sid ∧ R.bid = 103) }

Find the names of the sailors who have reserved a red boat
{ P | ∃S ∈ Sailors ∃R ∈ Reserves ∃B ∈ Boats (R.sid = S.sid ∧ B.bid = R.bid ∧ B.color = 'red' ∧ P.sname = S.sname) }
Retrieve all sailor tuples S for which there exist tuples R in Reserves and B in Boats such that R.sid = S.sid ∧ B.bid = R.bid ∧ B.color = 'red'.
An equivalent formulation nests the quantifier over Boats:
{ P | ∃S ∈ Sailors ∃R ∈ Reserves (R.sid = S.sid ∧ P.sname = S.sname ∧ ∃B ∈ Boats (B.bid = R.bid ∧ B.color = 'red')) }

Find the names of the sailors who have reserved all boats
{ P | ∃S ∈ Sailors ∀B ∈ Boats (∃R ∈ Reserves (S.sid = R.sid ∧ R.bid = B.bid ∧ P.sname = S.sname)) }
Find sailors S such that for all boats B there is a Reserves tuple showing that sailor S has reserved boat B.
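The nested quantifiers in the "all boats" query can be mimicked in Python, as a minimal sketch rather than a real query engine: `all(...)` plays the role of ∀ and `any(...)` the role of ∃. The Sailors and Reserves rows come from the example instance; the Boats rows (101 red, 103 green) are the ones used on the later DRC slides. The function name and the extra reservation in the second call are hypothetical, added only so the query has a non-empty answer.

```python
from collections import namedtuple

Sailor = namedtuple("Sailor", "sid sname rating age")
Reserve = namedtuple("Reserve", "sid bid day")
Boat = namedtuple("Boat", "bid bname color")

# Example instances taken from the slides.
sailors = {
    Sailor(22, "dustin", 7, 45.0),
    Sailor(31, "lubber", 8, 55.5),
    Sailor(58, "rusty", 10, 35.0),
}
reserves = {Reserve(22, 101, "10/10/96"), Reserve(58, 103, "11/12/96")}
boats = {Boat(101, "interlake", "red"), Boat(103, "marine", "green")}


def reserved_all_boats(sailors, reserves, boats):
    """Names of sailors S such that for every boat B some Reserves tuple links S to B."""
    return {
        s.sname
        for s in sailors  # ∃S ∈ Sailors
        if all(  # ∀B ∈ Boats
            any(r.sid == s.sid and r.bid == b.bid for r in reserves)  # ∃R ∈ Reserves
            for b in boats
        )
    }


# On the slide instance nobody has reserved both boats.
print(reserved_all_boats(sailors, reserves, boats))  # set()

# Assumption for illustration only: a hypothetical extra reservation of boat 103 by sailor 22.
extended = reserves | {Reserve(22, 103, "10/11/96")}
print(reserved_all_boats(sailors, extended, boats))  # {'dustin'}
```

The comprehension states only which sailors qualify; it says nothing about the order in which relations are scanned, which is the declarative flavour of the calculus.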
In relational algebra, the same query uses division:
ρ(Tempsids, (π_{sid,bid} Reserves) / (π_{bid} Boats))
π_{sname}(Tempsids ⋈ Sailors)

Find the sailors who have reserved all red boats
{ S | S ∈ Sailors ∧ ∀B ∈ Boats (B.color = 'red' ⇒ (∃R ∈ Reserves (S.sid = R.sid ∧ R.bid = B.bid))) }
Question: How must the algebraic representation change? For all boats it was:
ρ(Tempsids, (π_{sid,bid} Reserves) / (π_{bid} Boats))
π_{sname}(Tempsids ⋈ Sailors)

Find the sailors who have reserved all red boats (continued)
Logically, p ⇒ q is equivalent to ¬p ∨ q. (Why?)
{ S | S ∈ Sailors ∧ ∀B ∈ Boats (B.color = 'red' ⇒ (∃R ∈ Reserves (S.sid = R.sid ∧ R.bid = B.bid))) }
is rewritten as:
{ S | S ∈ Sailors ∧ ∀B ∈ Boats (B.color ≠ 'red' ∨ (∃R ∈ Reserves (S.sid = R.sid ∧ R.bid = B.bid))) }
to restrict attention to the red boats.

Domain Relational Calculus
A query has the form:
{ <x1, x2, ..., xn> | p(<x1, x2, ..., xn>) }
The answer includes all tuples <x1, x2, ..., xn> that make the formula p(<x1, x2, ..., xn>) true.

DRC Formulas
Atomic formula:
• <x1, x2, ..., xn> ∈ Rel, where Rel is a relation with n attributes
• X op Y
• X op constant
where op is one of the comparison ops (<, >, =, ≤, ≥, ≠),
or recursively defined:
• ¬p, p ∧ q, p ∨ q, p ⇒ q, where p and q are formulas
• ∃X(p(X)), where X is a domain variable
• ∀X(p(X)), where X is a domain variable

Free and Bound Variables
The use of a quantifier ∃X or ∀X in a formula is said to bind X. A variable that is not bound is free.
Let us revisit the definition of a query:
{ <x1, x2, ..., xn> | p(<x1, x2, ..., xn>) }
There is an important restriction: the variables x1, ..., xn that appear to the left of `|' must be the only free variables in the formula p(...).

Find all sailors with a rating above 7
{ <I, N, T, A> | <I, N, T, A> ∈ Sailors ∧ T > 7 }

Sailors
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0

• The condition <I, N, T, A> ∈ Sailors ensures that the domain variables I, N, T and A are bound to fields of the same Sailors tuple.
• The term <I, N, T, A> to the left of `|' says that every tuple <I, N, T, A> that satisfies T > 7 is in the answer.
Question: Modify this query to answer: find sailors who are older than 18 or have a rating under 9, and are called 'Joe'.
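As a rough illustration of the rating-above-7 DRC query, the sketch below lets four domain variables range independently over the values found in each column and keeps a combination only when it is a Sailors tuple with T > 7. This shows why the membership atom is needed to tie I, N, T and A to fields of the same tuple. The function names are hypothetical, and restricting each variable to the values present in its column is an assumption made to keep the search finite.

```python
from itertools import product

# Sailors instance from the slides, as positional tuples (sid, sname, rating, age).
sailors = {
    (22, "dustin", 7, 45.0),
    (31, "lubber", 8, 55.5),
    (58, "rusty", 10, 35.0),
}


def column(rel, k):
    """Set of values appearing in position k of the relation."""
    return {row[k] for row in rel}


def rating_above_7_naive(sailors):
    """Domain variables I, N, T, A each range over domain elements independently."""
    candidates = product(
        column(sailors, 0), column(sailors, 1), column(sailors, 2), column(sailors, 3)
    )
    return {
        (i, n, t, a)
        for (i, n, t, a) in candidates
        if (i, n, t, a) in sailors and t > 7  # <I,N,T,A> ∈ Sailors ∧ T > 7
    }


def rating_above_7(sailors):
    """Same query, letting the membership atom drive the iteration directly."""
    return {(i, n, t, a) for (i, n, t, a) in sailors if t > 7}


print(rating_above_7(sailors) == rating_above_7_naive(sailors))  # True
```

Both functions return the tuples for lubber and rusty; the second is simply the first with the membership test turned into the loop itself.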
Find sailors rated > 7 who've reserved boat #103
{ <I, N, T, A> | <I, N, T, A> ∈ Sailors ∧ T > 7 ∧ ∃Ir, Br, D (<Ir, Br, D> ∈ Reserves ∧ Ir = I ∧ Br = 103) }

Sailors
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0

Reserves
sid  bid  day
22   101  10/10/96
58   103  11/12/96

Boats
bid  bname      color
101  interlake  red
103  marine     green

• We have used ∃Ir, Br, D (...) as a shorthand for ∃Ir (∃Br (∃D (...))).
• Note the use of ∃ to find a tuple in Reserves that `joins with' the Sailors tuple under consideration.
The constant 103 can also be placed directly in the Reserves tuple:
{ <I, N, T, A> | <I, N, T, A> ∈ Sailors ∧ T > 7 ∧ ∃Ir, Br, D (<Ir, 103, D> ∈ Reserves ∧ Ir = I) }

Find sailors rated > 7 who've reserved a red boat
{ <I, N, T, A> | <I, N, T, A> ∈ Sailors ∧ T > 7 ∧ ∃Ir, Br, D (<Ir, Br, D> ∈ Reserves ∧ Ir = I ∧ ∃B, BN, C (<B, BN, C> ∈ Boats ∧ B = Br ∧ C = 'red')) }
(Evaluated on the Sailors, Reserves and Boats instances shown above.)
• Observe how the parentheses control the scope of each quantifier's binding.
• Exercise: rewrite the query using 'red' as a constant inside the Boats tuple, in place of the comparison C = 'red'.

Find sailors who've reserved all boats
{ <I, N, T, A> | <I, N, T, A> ∈ Sailors ∧ ∀<B, BN, C> ∈ Boats (∃<Ir, Br, D> ∈ Reserves (I = Ir ∧ Br = B)) }
Find all sailors I such that for each 3-tuple <B, BN, C>, either it is not a tuple in Boats or there is a tuple in Reserves showing that sailor I has reserved it. Written out in full:
{ <I, N, T, A> | <I, N, T, A> ∈ Sailors ∧ ∀B, BN, C (¬(<B, BN, C> ∈ Boats) ∨ (∃Ir, Br, D (<Ir, Br, D> ∈ Reserves ∧ I = Ir ∧ Br = B))) }

Algebra vs. Calculus
Are the two formal query languages equal in expressive power?
• Can every algebra query be expressed in calculus? YES!
• Can every calculus query be expressed in algebra?
• Unsafe query: { S | ¬(S ∈ Sailors) }. Is this correct? It asks for every tuple that is not in Sailors, which is not a finite set.
• Safe TRC: let Dom(Q, I) be the set of all constants that appear in the query Q or in the instance I. Q is safe if:
  - for any given I, the set of answers for Q contains only values that are in Dom(Q, I);
  - for each subexpression ∃R(p(R)), if a tuple r makes p(r) true, then r contains only constants in Dom(Q, I);
  - for each subexpression ∀R(p(R)), if a tuple r contains a constant that is not in Dom(Q, I), then p(r) is true.

Summary
• Algebra and safe calculus have the same expressive power.
• Relationally complete: a query language is relationally complete if it can express all the queries that can be expressed in relational algebra.
• Relational calculus is non-operational: users define queries in terms of what they want, not in terms of how to compute it. (Declarativeness.)
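The contrast between "what you want" and "how to compute it" can be sketched in Python for the boat-103 query from the TRC part. This is an illustrative sketch only: `select`, `natural_join` and `project` are hypothetical helper functions written for this example, not the API of any library, and the rows are the Sailors and Reserves instances from the slides.

```python
sailors = [
    {"sid": 22, "sname": "dustin", "rating": 7, "age": 45.0},
    {"sid": 31, "sname": "lubber", "rating": 8, "age": 55.5},
    {"sid": 58, "sname": "rusty", "rating": 10, "age": 35.0},
]
reserves = [
    {"sid": 22, "bid": 101, "day": "10/10/96"},
    {"sid": 58, "bid": 103, "day": "11/12/96"},
]


def select(rel, pred):
    """Selection: keep the rows that satisfy the predicate."""
    return [row for row in rel if pred(row)]


def natural_join(left, right):
    """Natural join: combine rows that agree on all shared attribute names."""
    out = []
    for l in left:
        for r in right:
            shared = set(l) & set(r)
            if all(l[k] == r[k] for k in shared):
                out.append({**l, **r})
    return out


def project(rel, attrs):
    """Projection: keep only the listed attributes, removing duplicates."""
    return {tuple(row[a] for a in attrs) for row in rel}


# Operational (algebra): an explicit plan of select, then join, then project.
algebra = project(
    natural_join(select(reserves, lambda r: r["bid"] == 103), sailors),
    ["sname"],
)

# Declarative (calculus): state the condition an answer tuple must satisfy.
calculus = {
    (s["sname"],)
    for s in sailors
    if any(r["sid"] == s["sid"] and r["bid"] == 103 for r in reserves)
}

assert algebra == calculus == {("rusty",)}
```

Both formulations return the same one-column relation, which is one small instance of the equivalence between algebra and safe calculus stated in the summary.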