1. A clustering index is used for numerous data pieces that have the same value for the ordering field. A primary index has records that are the same fixed length; it always has the same first field called the primary key; there is one index entry for every block of data. - Clustering is used when there is not a distinct size like the primary key. It has two fields in the ordered file; a disk pointer and a clustering index. - A secondary index is only used when there is already some way to primarily access the data. Its also an ordered file with two fields. Several secondary indexes can exist in one data set. 2. 0. Fd: propertyid -> country name, lot#, area, price, tax_rate - Area is common 1. Countryname, lot# -> propertyid, area, price, tax_rate 2. Countryname -> tax_rate 3.Area -> price ● Lossless join have at least 1 key relation ● LOTS1A and LOTS1B has common attribute 'Area' , but 'Area' is super key in LOTS1B and LOTS1A and LOTS2 has common attribute 'Country_name', it is superkey in LOTS2 , hence it is lossless join decomposition . 3. A set of inferences that is complete means that other logical conclusions cna come from it. The soundness means that you cannot derive dependencies that are not sound/ that do not hold true. “By sound, we mean that given a set of functional dependencies F specified on a relation schema R, any dependency that we can infer from F by using IR1 through IR3 holds in every relation state r of R that satisfies the dependencies in F. By complete, we mean that using IR1 through IR3 repeatedly to infer dependencies until no more dependencies can be inferred results in the complete set of all possible dependencies that can be inferred from F.” 4. ABD is a candidate key because there is no subset of the attributes that is key. Ab is not a candidate key. 5. All the attributes are atomic so it is in 1nf currently. It isnt 2nf because there is a dependency between key and non key (in this case salesman and commission). The is not a 3nf relation because theres dependencies that are transiitive. 2nf: CAR_SALE1(Car#, Salesman#, DateSold, DiscountAmount) CAR_SALE2(Salesman#, Commission%) 3nf: CAR_SALES1A(Car#, Salesman#, DateSold) CAR_SALES1B(DateSold, DiscountAmount) CAR_SALE3(Salesman#, Commission%) **6. Find left and right redundancies; find extraneous attributes there is no redundancies so: minimal cover is E ={P QRS, RS T} 7a. It is easy to insert and retrieve ; can take time and cause blanks 7b. More efficient because of order ; difficult to rearrange 7c. Good with lots of data because it is fast; there is a fixed number of buckets which is an issue if the system gets bigger or smaller - Hashing is the most difficult and expensive to impelement 8. There is the functional dependency of tripid and start date - Cities and cards visited can be used multiple times - The MVDS are: - (Start_date --> Cities_visited | Cards_used) (Trip_id --> Cities_visited | Cards_used) - There are no independencies in the relation so its 4nf ] 9a. M doesnt determine y and p - M Y does - M C doesnt determine y or p 9b. Not in bcnf or 3nf because the two FD m and mp are not superkeys, it is niot 2nf either so it is 1nf . It is lossless