HW 2 Solutions Exercise 3.3.1 i) Indicate all the BCNF violations. Don’t forget FD’s that follow from the set of given FD’s. ii) Decompose the relations into collections of relations that are in BCNF. a) R(A,B,C,D) with FD’s AB->C, C->D, and D->A What are the non-trivial FD’s? AB->C C->A C->D DB->C D->A AB->D Following algorithm 3.20 1. Check whether R is in BCNF. Relation R is in BCNF iff whenever there is a non trivial FD A1 A2 … An -> B1 B2 … Bn for R { A1 A2 … An } is a superkey. AB->C and AB->D: The set closure of AB+ is { A, B, C, D } therefore the left hand side of this FD is a superkey. C->D and C->A: The set closure of C+ is { C, D, A } therefore the left hand side is not a superkey. BCNF Violation D-> A: C->D: The set closure of D+ is { D, A } therefore the left hand side is not a superkey. BCNF Violation DB->C: The set closure of DB+ is { D, B, C, A } therefore the left hand side of this FD is a superkey. 2. If there are BCNF violations, let one be X->Y. Use Algorithm 3.7 to compute X+. Choose R1 = X+ as one relation schema and let R2 have attributes X and those attributes of R that are not in X+. The BCNF violation we will use is C->D. C+ is { C, D, A } R1 = { C, D, A } and R2 = { C, B } 3. Use Algorithm 3.12 to compute the sets of FD’s for R1 and R2; let these be S1 and S2, respectively. S1 = { C->D, C->A, D->A } and S2 = { empty } 4. Recursively decompose R1 and R2 using this algorithm. Return the union of the results of these decompositions. R2 is in BCNF because it only has two attributes. Now check R1: C->D, C->A: The set closure of C+ is { C, D, A } D->A: The set closure of D+ is { D, A } therefore the left hand side is not a superkey. BCNF Violation Step 2: R1.1 = { D, A } and R2.1 = { D, C } Recursion ends because each of these sets now contains only two attributes. Return the union of all R sets. Answer: R1.1 = { D, A } R2.1 = { D, C } R2 = { C, B } b) R(A,B,C,D) with FD’s B->C and B->D Following algorithm 3.20 1. Check whether R is in BCNF. Relation R is in BCNF iff whenever there is a non trivial FD A1 A2 … An -> B1 B2 … Bn for R { A1 A2 … An }is a superkey. B->C and B->D: The set closure of B+ is { B, C, D } therefore neither of the left side of these FD’s are superkeys. BCNF Violations 2. If there are BCNF violations, let one be X->Y. Use Algorithm 3.7 to compute X+. Choose R1 = X+ as one relation schema and let R2 have attributes X and those attributes of R that are not in X+. The BCNF violation we will use is B->C. B+ is { B, C, D } R1 = { B, C, D }and R2 = { B, A } 3. Use Algorithm 3.12 to compute the sets of FD’s for R1 and R2; let these be S1 and S2, respectively. S1 = { B->C, B->D } and S2 = { empty } 4. Recursively decompose R1 and R2 using this algorithm. Return the union of the results of these decompositions. R2 is in BCNF because it only has two attributes. Now check R1: B->C, B->D: The set closure of B+ is { B, C, D } therefore the left hand sides of these FD’s are superkeys. Answer: R1 = { B, C, D } R2 = { B, A } Exercise 3.5.4 i) ii) Indicate all the 3NF violations. Decompose the relations, as necessary, into collections of relations that are in 3NF. Relation R is in 3NF if whenever there is a non trivial FD A1 A2 … An -> B1 B2 … Bn for R either { A1 A2 … An }is a superkey, or B is a member of some key. a) R(A,B,C,D) with FD’s AB->C, C->D, and D->A Find all the keys in this relation by finding the set closures of the combinations of attributes. A+ = { A } B+ = { B } C+ = { C,D,A } D+ = { D,A } AB+ = { A,B,C,D } AC+ = { A,C,D } AD+ = { A,D } BC+ = { B,C,D,A } BD+ = { B,D,A,C } CD+ = { C,D,A } Check all the FD’s for a violation of the 3NF condition. AB->C: AB is a superkey, C is prime because it is a member of key BC. C->D: D is prime because it is a member of key BD. D->A: A is prime because it is a member of key AB. Answer: No 3NF violations. No decomposition needed. b) R(A,B,C,D) with FD’s B->C and B->D Find all the keys in this relation by finding the set closures of the combinations of attributes. A+ = { A } B+ = { B,C,D } C+ = { C } D+ = { D } AB+ = { A,B,C,D } AC+ = { A,C } AD+ = { A,D } BC+ = { B,C,D } BD+ = { B,D } CD+ = { C,D } Check all the FD’s for a violation of the 3NF condition. B->C: B is not a superkey, C is not prime. 3NF Violation B->D: B is not a superkey. D is not prime. 3NF Violation Decompose on violation B->C. R1 = { B,C } and R2 = { B,A,D } Now recursively check for 3NF violations on R2 Find all the keys in R2 by find the set of closures of the combinations of attributes that exist in R2. The only FD that exists for R2 is B->D. B+ = { B,D } A+ = { A } D+ = { D } BA+ = { B,A,D } BD+ = { B,D } AD+ = { A,D } Again check all FD’s for a violation of the 3NF condition. In this case only one FD exists. B->D: B is not a superkey and D is not prime. 3NF Violation Decompose into two relations: R2.1 = { B,D } and R2.2 = { B,A } Answer: R1 = { B,C } R2.1 = { B,D } R2.2 = { B,A } Exercise 5.1.1 Let PC be the relation of Fig 2.2.1(a) and suppose we compute the projection speed(PC). What is he value of this expression as a set? 2.66 2.10 1.42 2.80 3.20 2.20 2.00 1.86 3.06 As a bag? 2.66 2.10 1.42 2.80 3.20 3.20 2.20 2.20 2.00 2.80 1.86 2.80 3.06 What is the average value of tuples in this projection, when treated as a set? Answer: 2.37 As a bag? Answer: 2.49 Exercise 5.1.4 Certain algebraic laws for relations as sets also hold for relations as bags. Explain why each of the laws below hold for bags as well as sets. b) The associative law for intersection: (R S) T = R (S T). On the right-hand side: Given bags R and S where a tuple t appears n and m times respectively, the intersection of bags R and S will have tuple t appear min( n, m ) times. The further intersection of bag T with the tuple t appearing o times we will produce t min( o, min( n, m ) ) times. On the left-hand side: Given bags S and T where a tuple t appears m and o times respectively, the intersection of bags R and S will have tuple t appear min( m, o ) times. The further intersection of bag R with the tuple t appearing n times we will produce t min( n, min( m, o ) ) times. Both sides will yield the minimum number of times tuple t exists in all three bags. Bags are essentially sets that allow the appearance of a tuple more than once. In this case the operations on a bag and set yield the same results. d) The commutative law for union: (R S) = (S R). Suppose a tuple t occurs n and m times in sets R and S respectively. The union of these two bags in the bag R S, tuple t would appear n + m times. The union of these two bags in the bag S R, tuple t would appear m + n times. Both sides of the relation yield the same result. In a set a tuple can only appear at most one time. Tuple t might appear in set R and S 1 or 0 times. The combinations of number of occurrences for tuple t in R and S respectively are (0,0), (0,1), (1,0), and (1,1). The union of either one of these combinations on the right or left side of the relation would yield the same result. Exercise 5.2.1 Here are two relations: R(A,B): {(0,1), (2,3), (0,1), (2,4), (3,4) } S(B,C): {(0,1), (2,4), (2,5), (3,4), (0,2), (3,4) } Compute the following: a) A+B, A2, B2 (R) A2 0 4 0 4 9 A+B 1 5 1 6 7 B2 1 9 1 16 16 b) B+1, C-1 (S) B+1 1 3 3 4 1 4 C-1 0 3 4 3 1 3