Chapter Four The Relational Model and Normalization

CHAPTER 4 THE RELATIONAL MODEL AND NORMALIZATION

ANSWERS TO GROUP I QUESTIONS

4.1 What restrictions must be placed on a table for it to be considered a relation?
Rows contain data about an entity.
Columns contain data about attributes of the entity.
Cells of the table hold a single value.
All entries in a column are of the same kind.
Each column has a unique name.
The order of the columns is unimportant.
The order of the rows is unimportant.
No two rows may be identical.

4.2 Define the following terms: relation, tuple, attribute, file, record, field, table, row, column.
A relation is a two-dimensional table that has the characteristics listed in question 4.1 above.
A tuple is one row of a table.
An attribute is one column in a table.
A file is a term used by many people to refer to a table.
A record is the same as a row of a table.
A field is the same as a column of a table.
A table is a relation. Generally these terms are interchangeable.
A row contains all data about one instance of an entity represented in a table.
A column contains all values assigned to an attribute in a table.

4.3 Define functional dependency. Give an example of two attributes that have a functional dependency and give an example of two attributes that do not have a functional dependency.
A functional dependency is a relationship between attributes such that, given the value of one attribute, it is possible to determine the value of the other attribute.
Example of functional dependency: Name → Phone#.
Example of attributes that are not functionally dependent: Age and Address.

4.4 If SID functionally determines Activity, does this mean that only one value of SID can exist in the relation? Why or why not?
No, there will be many different values of SID in a relation. Probably, but not necessarily, each different value of SID will determine a different value for Activity. Also, a determinant (such as SID) is not necessarily unique within a relation.
However, a particular value of SID will have only one corresponding value of Activity, no matter how many times the value of SID appears.

4.5 Define determinant.
A determinant is an attribute whose value enables us to obtain the value(s) of other related attributes. It appears on the left side of a functional dependency. Thus, in SID → Student_Name, the determinant is SID.

4.6 Give an example of a relation having a functional dependency in which the determinant has two or more attributes.
Assume a relation contains all football players in a game. There are two teams, and some of the players’ numbers occur on both teams. In that case, to determine a player’s name you also need to know the name of the player’s team.
(Team_Name, Player_Number) → Player_Name

4.7 Define key.
A key is one or more columns of a relation that identify a row in a table. A key can be unique or non-unique. Primary keys must be unique.

4.8 If SID is a key of a relation, is it a determinant? Can a given value of SID occur more than once in the relation?
Yes, SID is a determinant. A value of SID can occur in a relation more than once if it is a non-unique key. If it is the primary key, the value cannot occur in the relation more than once, because a primary key must be unique.

4.9 What is a deletion anomaly? Give an example other than one in this text.
A deletion anomaly occurs when, by deleting the facts about one entity instance, we inadvertently delete facts about another entity instance. With one deletion, we lose facts about two entities. For example, in the unnormalized COURSE relation of question 4.12, deleting the only course offered by a department also deletes that department’s name.

4.10 What is an insertion anomaly? Give an example other than one in this text.
An insertion anomaly occurs when we cannot insert a fact about one entity until we have an additional fact about another entity. For example, in the unnormalized COURSE relation of question 4.12, a department’s name cannot be recorded until the department offers at least one course.

4.11 Explain the relationship of first, second, third, Boyce–Codd, fourth, fifth, and domain/key normal forms.
In order to be in a higher normal form, a relation must also be in all lower normal forms. That is, a relation in second normal form is also in first normal form, and a relation in 5NF (fifth normal form) is also in 4NF, BCNF, 3NF, 2NF, and 1NF.

4.12 Define second normal form. Give an example of a relation in 1NF, but not in 2NF. Transform the relation into relations in 2NF.
A relation is in second normal form if all its non-key attributes are dependent on all of the key. A table that is in second normal form cannot have a partial dependency. Assume:
COURSE (Dept_Code, Course_Number, Course_Title, Credits, Department_Name)
The table is in first normal form because it is flat (has no repeating attributes). Because the key (Dept_Code, Course_Number) is composite, the table is not in second normal form if Dept_Code → Department_Name: only part of the primary key determines an attribute. To put the table in second normal form, use the following tables:
COURSE (Dept_Code, Course_Number, Course_Title, Credits)
DEPARTMENT (Dept_Code, Department_Name)

4.13 Define third normal form. Give an example of a relation in 2NF, but not in 3NF. Transform the relation into relations in 3NF.
A relation is in third normal form if it is in second normal form and has no transitive dependencies. All determinants must be keys. Assume:
STUDENT (SID, Stu_Name, P_O_Box, Major, Hours_Required)
The table is in first normal form because it is flat (has no repeating attributes). The table is in second normal form because it has an atomic (single-attribute) key. If you assume Major → Hours_Required, then Major is a determinant but not the key, so there is a transitive dependency: SID → Major → Hours_Required. To put the table in third normal form, use the following tables:
STUDENT (SID, Stu_Name, P_O_Box, Major)
DEPARTMENT (Major, Hours_Required)

4.14 Define BCNF. Give an example of a relation in 3NF, but not in BCNF. Transform the relation into relations in BCNF.
A relation is in BCNF if every determinant is a candidate key. Assume the following relation:
REGISTERED_AUTO (State, License_Number, Issuing_Agency)
The key of the relation is the composite of State and License_Number (a license number is not unique by itself because the same number is issued in several states). If you assume Issuing_Agency is the location where the auto license is issued, then (License_Number, Issuing_Agency) can be a candidate key (agencies are unique within a state but not between states).

4.15 Define multi-value dependency. Give an example.
In general, a multi-value dependency exists when a relation has at least three attributes, two of them are multi-value, and their values depend on only the third attribute. One case of a multi-value dependency would be where a State has multiple Senators. Assume the State also has multiple Representatives. This gives State →→ Senator and State →→ Representative. If this is a single relation, then:
CONGRESS (State, Senator, Representative)

4.16 Why are multi-value dependencies not a problem in relations with only two attributes?
If the table has only two attributes, you cannot have a case where A →→ B and A →→ C, since there is no C.

4.17 Define fourth normal form. Give an example of a relation in BCNF, but not in 4NF. Transform the relation into relations in 4NF.
A relation is in fourth normal form if it is in BCNF and has no multi-value dependencies. The following table is not in fourth normal form because State →→ Senator and State →→ Representative:
CONGRESS (State, Senator, Representative)
To place the table in fourth normal form, you would use:
STATE_SENATOR (State, Senator)
STATE_REPRESENTATIVE (State, Representative)

4.18 Define domain/key normal form. Why is it important?
A relation is in DK/NF if every constraint on the relation is a logical consequence of the definition of keys and domains.
Informally, a relation is in DK/NF if enforcing key and domain restrictions causes all of the constraints to be met. DK/NF is important because a relation in DK/NF has no modification anomalies, and a relation having no modification anomalies must be in DK/NF. This establishes a bound on the definition of normal forms, so no higher normal form is needed, at least in order to eliminate modification anomalies. Equally important, DK/NF involves only the concepts of key and domain, which are fundamental and close to the heart of database practitioners. They are readily supported by DBMS products (or could be, at least).

4.19 Transform the following relation into DK/NF. Make and state the appropriate assumptions about functional dependencies and domains.
EQUIPMENT (Manufacturer, Model, AcquisitionDate, BuyerName, BuyerPhone, PlantLocation, City, State, ZIP)
Assumptions:
BuyerName → BuyerPhone, PlantLocation, City, State, Zip
Zip → City, State
(Manufacturer, Model, BuyerName) → AcquisitionDate
Relations:
BUYER (BuyerName, BuyerPhone, PlantLocation, Zip)
PURCHASE (Manufacturer, Model, BuyerName, AcquisitionDate)
ZIP (Zip, City, State)

4.20 Transform the following relation into DK/NF. Make and state the appropriate assumptions about functional dependencies and domains.
INVOICE (Number, CustomerName, CustomerNumber, CustomerAddress, ItemNumber, ItemPrice, ItemQuantity, SalespersonNumber, SalespersonName, Subtotal, Tax, TotalDue)
Assumptions:
Number → CustomerNumber, ItemNumber, ItemQuantity, SalespersonNumber, Subtotal, Tax, TotalDue
ItemNumber → ItemPrice
CustomerNumber → CustomerAddress
SalespersonNumber → SalespersonName
Relations:
INVOICE (Number, CustomerNumber, ItemNumber, ItemQuantity, SalespersonNumber, Subtotal, Tax, TotalDue)
ITEM (ItemNumber, ItemPrice)
CUSTOMER (CustomerNumber, CustomerAddress)
SALESPERSON (SalespersonNumber, SalespersonName)

4.21 Answer question 4.20 again, but this time add attribute CustomerTaxStatus (0 if nonexempt, 1 if exempt).
Also, add the constraint that there will be no tax if CustomerTaxStatus = 1.
CUSTOMER becomes two relations:
EX-CUSTOMER (CustomerNumber, CustomerName, CustomerAddress, CustomerTaxStatus)
Constraint: CustomerTaxStatus = 1.
NOT-EX-CUSTOMER (CustomerNumber, CustomerName, CustomerAddress, CustomerTaxStatus)
Constraint: CustomerTaxStatus = 0.

4.22 Give an example, other than one in this text, in which you would judge normalization to be not worthwhile. Show the relations and justify your design.
AUTO (VIN, Make, Model, Color, Year)
This table is not in third normal form because Model → Make (all Mustangs are Fords, all Impalas are Chevrolets). I would not normalize this table because (1) this is the way most people think of an auto and (2) creating a new table with model and make would remove very little redundancy and would create an inter-relation constraint that is probably not necessary.

4.23 Explain two situations in which database designers might intentionally choose to create data duplication. What is the risk of such designs?
Designers may prefer a de-normalized design when the normalized design is unnatural or awkward for users, or when it results in unacceptable performance. In the table in question 4.22 above, creating a separate table for Make and Model would be unnatural for the user and would degrade database performance: it would take two reads from two different tables to get the auto data, rather than one read if it is stored as a single table. De-normalized designs contain duplicated data. This creates a potential data integrity problem if all copies of the duplicated data are not updated at the same time.
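
The definition of functional dependency in 4.3 and the 2NF decomposition in 4.12 can be checked mechanically against sample rows. The following Python sketch is illustrative only: the rows in COURSE are hypothetical (the answers give no data for this relation), and fd_holds and project are helper names introduced here. Note that a dependency holding in sample data does not prove it is a business rule; the test can only refute a dependency.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # The first value seen for a key is stored; any later mismatch refutes the FD.
        if seen.setdefault(key, val) != val:
            return False
    return True


def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples (order preserved)."""
    out, seen = [], set()
    for row in rows:
        t = tuple(row[a] for a in attrs)
        if t not in seen:
            seen.add(t)
            out.append(dict(zip(attrs, t)))
    return out


COLUMNS = ("Dept_Code", "Course_Number", "Course_Title", "Credits", "Department_Name")

# Hypothetical sample data for the COURSE relation of question 4.12.
COURSE = [dict(zip(COLUMNS, r)) for r in [
    ("CS", 101, "Intro to Programming", 3, "Computer Science"),
    ("CS", 201, "Data Structures", 4, "Computer Science"),
    ("MA", 101, "Calculus I", 4, "Mathematics"),
]]

# Part of the composite key determines Department_Name: a partial dependency.
partial = fd_holds(COURSE, ["Dept_Code"], ["Department_Name"])

# Course_Number alone does not determine Course_Title (101 appears in two departments).
number_alone = fd_holds(COURSE, ["Course_Number"], ["Course_Title"])

# The 2NF decomposition given in the answer to 4.12.
course_2nf = project(COURSE, ["Dept_Code", "Course_Number", "Course_Title", "Credits"])
department = project(COURSE, ["Dept_Code", "Department_Name"])

print(partial, number_alone, len(course_2nf), len(department))
```

After the decomposition, each department name is stored once in DEPARTMENT instead of once per course, which is what removes the deletion and insertion anomalies described in 4.9 and 4.10.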
ANSWERS TO GROUP II QUESTIONS

4.24 Consider the following relation definition and sample data:
PROJECT (ProjectID, EmployeeName, EmployeeSalary)
Where
ProjectID is the name of a work project
EmployeeName is the name of an employee who works on that project
EmployeeSalary is the salary of the employee whose name is EmployeeName
Assuming that all of the functional dependencies and constraints are apparent in this data, which of the following statements is true?
A. ProjectID → EmployeeName
FALSE: There are multiple employee names for each project (see project 100A).
B. ProjectID → EmployeeSalary
FALSE: There are multiple employee salaries for each project (see project 100A).
C. (ProjectID, EmployeeName) → EmployeeSalary
FALSE: Each employee’s salary is always the same, regardless of the project (see Smith).
D. EmployeeName → EmployeeSalary
TRUE: An employee name always has the same salary (see Smith).
E. EmployeeSalary → ProjectID
FALSE: There are multiple ProjectIDs for a given salary (see salary 51K).
F. EmployeeSalary → (ProjectID, EmployeeName)
FALSE: There are multiple ProjectID and EmployeeName combinations for a given salary (see salary 51K).
Answer these questions:
G. What is the key of PROJECT?
ProjectID and EmployeeName (composite key).
H. Are all non-key attributes (if any) dependent on all of the key?
No: EmployeeSalary is the non-key attribute, and it is dependent on EmployeeName only.
I. In what normal form is PROJECT?
PROJECT is in first normal form because it has no multi-valued attributes. It is not in second normal form because it has a partial dependency: the key is (ProjectID, EmployeeName), but EmployeeName → EmployeeSalary.
J. Describe two modification anomalies from which PROJECT suffers.
Insertion Anomaly: You cannot add an Employee until the Employee is assigned to a Project. Likewise, you cannot add a Project until an Employee is assigned to the Project.
Update Anomaly: If you want to change Smith’s salary, you will need to change three rows of data in order to change one employee’s salary.
Deletion Anomaly: If Parks did not work on Project 200C and worked in Project 200D only, deletion of ProjectC would delete the fact that Parks’s salary was 28K.
K. Is ProjectID a determinant?
No.
L. Is EmployeeName a determinant?
YES: EmployeeName → EmployeeSalary.
M. Is (ProjectID, EmployeeName) a determinant?
No.
N. Is EmployeeSalary a determinant?
No. In this particular case it appears that it could be a determinant because no two people have the same salary. Logically, however, one would assume that no business rule in a firm says two people cannot have the same salary.
O. Does this relation contain a partial dependency? If so, what is it?
YES, the relation does contain a partial dependency. The key is (ProjectID, EmployeeName), but EmployeeName → EmployeeSalary.
P. Redesign this relation to eliminate the modification anomalies.
PROJECT (ProjectID, EmployeeName)
EMPLOYEE (EmployeeName, EmployeeSalary)

4.25 Consider the following relation definition and sample data:
PROJECT-HOURS (EmployeeName, ProjectID, TaskID, Phone, TotalHours)
Where
EmployeeName is the name of an employee
ProjectID is the name of a project
TaskID is the name of a standard work task
Phone is the employee’s telephone number
TotalHours is the hours worked by the employee on this project
Assuming that all of the functional dependencies and constraints are apparent in this data, which of the following statements is true?
A. EmployeeName → ProjectID
NO: There are multiple ProjectIDs for each EmployeeName.
B. EmployeeName →→ ProjectID
YES: Each EmployeeName has three or more ProjectIDs.
C. EmployeeName → TaskID
NO: There are multiple TaskIDs for each EmployeeName.
D. EmployeeName →→ TaskID
YES: Don has multiple (2) TaskIDs.
E. EmployeeName → Phone
YES: Each EmployeeName has exactly one Phone value.
F. EmployeeName → TotalHours
YES: Each EmployeeName has exactly one TotalHours value.
G. (EmployeeName, ProjectID) → TotalHours
YES: This is true based upon the fourth assumption stated above. Looking at the data only, it is more probable that EmployeeName → TotalHours, because TotalHours is the same for an EmployeeName regardless of the ProjectID.
H. (EmployeeName, Phone) → TaskID
NO: There are multiple TaskIDs for a given (EmployeeName, Phone) combination.
I. ProjectID → TaskID
NO: There are multiple TaskIDs for a given ProjectID.
J. TaskID → ProjectID
NO: There are multiple ProjectIDs for a given TaskID.
Answer these questions:
K. What are all of the determinants?
EmployeeName → Phone
EmployeeName →→ TaskID
(EmployeeName, ProjectID) → TotalHours
L. Does this relation contain a partial dependency? If so, what is it?
YES, it does contain a partial dependency. The key is (EmployeeName, ProjectID, TaskID), but EmployeeName → Phone, so Phone depends on only part of the key.
M. Does this relation contain a multi-value dependency? If so, what are the unrelated attributes?
YES: The unrelated attributes are ProjectID and TaskID. EmployeeName →→ ProjectID and EmployeeName →→ TaskID.
N. What is the deletion anomaly that this relation contains?
If employee Don no longer has the TaskID B-1, two rows must be deleted, Row 1 and Row 3. The deletion of one fact requires the deletion of two rows.
O. How many themes does this relation have?
It would appear that there are at least three themes: (1) employees and their tasks, (2) employees and their phone numbers, and (3) employees and hours worked on a project.
P. Redesign this relation to eliminate the modification anomalies. How many relations did you use? How many themes does each of your new relations contain?
EMPLOYEE (EmployeeName, Phone)
EMPLOYEE_TASKS (EmployeeName, TaskID)
PROJECT-HOURS (EmployeeName, ProjectID, TotalHours)
Three relations are required, one for each theme. Each relation now carries one theme.
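
The multi-value dependency and the deletion anomaly discussed in 4.25 (items M and N) can be illustrated with a small Python sketch. The rows below are hypothetical stand-ins, because the sample data for PROJECT-HOURS is not reproduced in these answers, and mvd_holds is a helper name introduced here. The check tests the usual condition for X →→ Y: for each X value, every Y value must appear with every combination of the remaining attributes.

```python
from itertools import product

ATTRS = ("EmployeeName", "ProjectID", "TaskID")


def mvd_holds(rows, x, y, attrs=ATTRS):
    """Return True if the multi-value dependency x ->> y holds in rows.

    x and y are single attribute names; the remaining attributes form z.
    """
    z = [a for a in attrs if a not in (x, y)]
    groups = {}
    for row in rows:
        ys, zs, pairs = groups.setdefault(row[x], (set(), set(), set()))
        y_val = row[y]
        z_val = tuple(row[a] for a in z)
        ys.add(y_val)
        zs.add(z_val)
        pairs.add((y_val, z_val))
    # Within each x group, the (y, z) pairs must be the full cross product.
    return all(pairs == set(product(ys, zs)) for ys, zs, pairs in groups.values())


# Hypothetical data: Don works on two projects and performs two tasks.
ROWS = [dict(zip(ATTRS, t)) for t in [
    ("Don", "100A", "B-1"),
    ("Don", "100A", "C-1"),
    ("Don", "200B", "B-1"),
    ("Don", "200B", "C-1"),
    ("Pete", "100A", "B-1"),
]]

full = mvd_holds(ROWS, "EmployeeName", "TaskID")

# Deleting only one of Don's B-1 rows leaves the relation inconsistent.
one_deleted = mvd_holds(ROWS[1:], "EmployeeName", "TaskID")

# Removing the single fact "Don performs B-1" requires deleting two rows.
both_deleted = [r for r in ROWS
                if not (r["EmployeeName"] == "Don" and r["TaskID"] == "B-1")]
consistent_again = mvd_holds(both_deleted, "EmployeeName", "TaskID")

# In EMPLOYEE_TASKS (EmployeeName, TaskID) the same fact is a single row.
employee_tasks = {(r["EmployeeName"], r["TaskID"]) for r in ROWS}

print(full, one_deleted, consistent_again, len(employee_tasks))
```

In the decomposed design of item P, removing Don's task B-1 deletes exactly one row of EMPLOYEE_TASKS, so the anomaly disappears.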
4.26 Consider the following domain, relation, and key definitions:
Domain Definitions
EmployeeName in Names values CHAR(20)
PhoneNumber in Phones values DEC(5)
EquipmentName in ENames values CHAR(10)
Location in Places values CHAR(7)
Cost in Money values CURRENCY
Date in Dates values YYMMDD
Time in Times values HHMM where HH between 00 and 23 and MM between 00 and 59
Definitions of Relation, Key, and Constraint
EMPLOYEE (EmployeeName, PhoneNumber)
Key: EmployeeName
Constraints: EmployeeName → PhoneNumber
EQUIPMENT (EquipmentName, Location, Cost)
Key: EquipmentName
Constraints: EquipmentName → Location
EquipmentName → Cost
APPOINTMENT (Date, Time, EquipmentName, EmployeeName)
Key: (Date, Time, EquipmentName)
Constraints: (Date, Time, EquipmentName) → EmployeeName
A. Modify the definitions to enforce this constraint: An employee may not sign up for more than one equipment appointment.
Change the key of APPOINTMENT to EmployeeName:
APPOINTMENT (Date, Time, EquipmentName, EmployeeName)
Key: (EmployeeName)
Constraints: EmployeeName → (Date, Time, EquipmentName)
B. Define nighttime to refer to the hours between 2100 and 0500. Add an attribute EmployeeType whose value is 1 if the employee works during nighttime. Change this design to enforce the constraint that only employees who work at night can schedule nighttime appointments.
Add the following domain definitions:
NightTime in Times values HHMM where HH between 21 and 04 and MM between 00 and 59
EmployeeType in Types values DEC(1)
NightEmp in Types values DEC(1) where EmployeeType = 1
Add EmployeeType to the EMPLOYEE relation:
EMPLOYEE (EmployeeName, PhoneNumber, EmployeeType)
Key: EmployeeName
Constraints: EmployeeName → PhoneNumber
Replace the APPOINTMENT relation by:
DAY-APPT (Date, DayTime, EquipmentName, EmployeeName)
NIGHT-APPT (Date, NightTime, EquipmentName, NightEmp)

ANSWERS TO FIREDUP PROJECT QUESTIONS

FiredUp hired a team of database designers (who should have been fired!) to create the following relations for a database to keep track of their stove, repair, and customer data. See the projects at the end of Chapters 1 through 3 to review their needs. For each of the following relations, specify candidate keys, functional dependencies, and multi-valued dependencies (if any). Justify these specifications unless they are obvious. Given your specifications about keys and so on, what normal form does each relation have? Transform each relation into two or more relations that are in domain/key normal form. Indicate the primary key of each table, candidate keys, and foreign keys, and specify any referential integrity constraints. In answering these questions, assume the following:
Stove type and version determine tank capacity.
A stove can be repaired many times, but never more than once on a given day.
Each stove repair has its own repair invoice.
A stove can be registered to different users, but never at the same time.
A stove has many component parts, and each component part can be used on many stoves. Thus, FiredUp maintains records about part types, such as burner valve, and not about particular parts such as burner valve number 41734 manufactured on 12 December 2003.

A.
PRODUCT1 (SerialNumber, Type, VersionNumber, TankCapacity, DateOfManufacture, InspectorInitials)
Candidate Keys: SerialNumber
Functional Dependencies:
SerialNumber → all other attributes
(Type, VersionNumber) → TankCapacity
Multi-valued dependencies: None.
Normal form: 2NF but not 3NF because of the transitive dependency:
SerialNumber → (Type, VersionNumber) → TankCapacity
Domain/Key Normal Form Relations:
PRODUCT (SerialNumber, Type, VersionNumber, DateOfManufacture, InspectorInitials)
STOVE-TYPE (Type, VersionNumber, TankCapacity)
No additional candidate keys.
Referential integrity constraint:
(Type, VersionNumber) in PRODUCT must exist in (Type, VersionNumber) in STOVE-TYPE

B.
PRODUCT2 (SerialNumber, Type, TankCapacity, RepairDate, RepairInvoiceNumber, RepairCost)
Candidate Keys:
RepairInvoiceNumber
(SerialNumber, RepairDate)
Functional Dependencies:
RepairInvoiceNumber → all other attributes
(SerialNumber, RepairDate) → all other attributes
SerialNumber → (Type, TankCapacity) (note: without VersionNumber, Type does not determine TankCapacity)
Multi-valued dependencies: None.
Normal form: 2NF but not 3NF, because SerialNumber is a determinant but not the key.
Domain/Key Normal Form Relations:
STOVE (SerialNumber, Type, TankCapacity)
STOVE-REPAIR (RepairInvoiceNumber, RepairDate, RepairCost, SerialNumber)
Candidate key: (RepairDate, SerialNumber) in STOVE-REPAIR
Referential integrity constraint:
SerialNumber in STOVE-REPAIR must exist in SerialNumber in STOVE

C.
REPAIR1 (RepairInvoiceNumber, RepairDate, RepairCost, RepairEmployeeName, RepairEmployeePhone)
Candidate Keys: RepairInvoiceNumber
Functional Dependencies:
RepairInvoiceNumber → all other attributes
RepairEmployeeName → RepairEmployeePhone
Multi-valued dependencies: None.
Normal form: 2NF but not 3NF because of the transitive dependency:
RepairInvoiceNumber → RepairEmployeeName → RepairEmployeePhone
Domain/Key Normal Form Relations:
REPAIR (RepairInvoiceNumber, RepairDate, RepairCost, RepairEmployeeName)
EMP-PHONE (RepairEmployeeName, RepairEmployeePhone)
No additional candidate keys.
Referential integrity constraint:
RepairEmployeeName in REPAIR must exist in RepairEmployeeName in EMP-PHONE

D.
REPAIR2 (RepairInvoiceNumber, RepairDate, RepairCost, RepairEmployeeName, RepairEmployeePhone, SerialNumber, Type, TankCapacity)
Candidate Keys:
RepairInvoiceNumber
(RepairDate, SerialNumber)
Functional Dependencies:
RepairInvoiceNumber → all other attributes
(RepairDate, SerialNumber) → all other attributes
SerialNumber → Type, TankCapacity
RepairEmployeeName → RepairEmployeePhone
Multi-valued dependencies: None.
Normal form: 2NF but not 3NF, because the key is RepairInvoiceNumber but RepairEmployeeName → RepairEmployeePhone (transitive dependency).
Domain/Key Normal Form Relations:
REPAIR (RepairInvoiceNumber, RepairDate, RepairCost, RepairEmployeeName, SerialNumber)
EMP-PHONE (RepairEmployeeName, RepairEmployeePhone)
STOVE (SerialNumber, Type, TankCapacity)
Candidate key: (RepairDate, SerialNumber) in REPAIR
Referential integrity constraints:
RepairEmployeeName in REPAIR must exist in RepairEmployeeName in EMP-PHONE
SerialNumber in REPAIR must exist in SerialNumber in STOVE

E.
REPAIR3 (RepairDate, RepairCost, SerialNumber, DateOfManufacture)
Candidate Keys: (RepairDate, SerialNumber)
Functional Dependencies:
(RepairDate, SerialNumber) → all other attributes
SerialNumber → DateOfManufacture
Multi-valued dependencies: None.
Normal form: 1NF but not 2NF, because DateOfManufacture is not dependent on all of the key (RepairDate, SerialNumber).
Domain/Key Normal Form Relations:
REPAIR (RepairDate, SerialNumber, RepairCost)
STOVE (SerialNumber, DateOfManufacture)
No additional candidate keys.
Referential integrity constraint:
SerialNumber in REPAIR must exist in SerialNumber in STOVE

F.
STOVE1 (SerialNumber, RepairInvoiceNumber, ComponentPartNumber)
Candidate Keys: (RepairInvoiceNumber, ComponentPartNumber)
Functional Dependencies:
RepairInvoiceNumber → SerialNumber
Multi-valued dependencies:
RepairInvoiceNumber →→ ComponentPartNumber (noting also that RepairInvoiceNumber → SerialNumber)
Normal form: 3NF but not 4NF.
Domain/Key Normal Form Relations:
REPAIR (SerialNumber, RepairInvoiceNumber)
REPAIR-PART (RepairInvoiceNumber, ComponentPartNumber)
No additional candidate keys.
Referential integrity constraint:
RepairInvoiceNumber in REPAIR-PART must exist in RepairInvoiceNumber in REPAIR

G.
STOVE2 (SerialNumber, RepairInvoiceNumber, RegisteredOwnerID)
Assume there is a need to record the owner of a stove, even if it has never been repaired.
This example shows a good application for domain/key normal form. The assumption that every stove has a RegisteredOwnerID means that every stove will have at least one row in STOVE2. That row will have a value for SerialNumber and RegisteredOwnerID, and RepairInvoiceNumber will be null. Furthermore, because a stove may be registered to more than one owner, SerialNumber cannot determine RegisteredOwnerID. Also, because an owner may own more than one stove, RegisteredOwnerID cannot determine SerialNumber. So both SerialNumber and RegisteredOwnerID have to be in the key. However, for stoves that have been in for repair, there will be multiple rows for a given (SerialNumber, RegisteredOwnerID), so the key has to be (SerialNumber, RepairInvoiceNumber, RegisteredOwnerID). But now we have the following constraint:
If RepairInvoiceNumber is not null, then RepairInvoiceNumber → SerialNumber
(If RepairInvoiceNumber is null, then this is not true.) This is a constraint under Fagin’s definition of constraint in domain/key normal form, but it is not directly discussed in any of the five normal forms.
Thus, its current normal form is unclassifiable, but we know it is not DK/NF because of the RepairInvoiceNumber constraint, which is not implied by the key definition. Hence, it will have modification anomalies. To construct DK/NF relations, split it into two as follows:
REPAIR (SerialNumber, RepairInvoiceNumber)
STOVE-OWNER (SerialNumber, RegisteredOwnerID)
There are no additional candidate keys, and there is the referential integrity constraint:
SerialNumber in REPAIR must exist in SerialNumber in STOVE-OWNER

H.
Given the assumptions of this case, the relations and attributes in items A–G, and your knowledge of small business, construct a set of domain/key relations for FiredUp. Indicate primary keys, foreign keys, and referential integrity constraints.
CAPACITY (Type, Version, TankCapacity)
OWNER (OwnerID, OwnerName)
PART (PartID, PartDescription, PartCost)
STOVE (SerialNumber, Type, Version, ManufacturerDate, StoveOwner)
Foreign Key (Type, Version) references CAPACITY
Foreign Key (StoveOwner) references OWNER
Constraints:
(Type, Version) in STOVE must exist in (Type, Version) in CAPACITY
StoveOwner in STOVE must exist in OwnerID in OWNER
REPAIRPERSON (RepairPersonName, RepairPersonName)
REPAIR (RepairInvoiceNumber, SerialNumber, RepairDate, RepairCost, RepairPersonName)
Foreign Key (SerialNumber) references STOVE
Foreign Key (RepairPersonName) references REPAIRPERSON
Constraints:
SerialNumber in REPAIR must exist in SerialNumber in STOVE
RepairPersonName in REPAIR must exist in RepairPersonName in REPAIRPERSON
PART_REPAIR_INT (PartID, RepairInvoiceNumber, NumberUsed)
Foreign Key (PartID) references PART
Foreign Key (RepairInvoiceNumber) references REPAIR
Constraints:
PartID in PART_REPAIR_INT must exist in PartID in PART
RepairInvoiceNumber in PART_REPAIR_INT must exist in RepairInvoiceNumber in REPAIR

ANSWERS TO TWIGS TREE TRIMMING SERVICE

Samantha hired a team of database designers who created the following relations for a database to keep track of her customer, service, chip, and related data. See the projects at the end of Chapters 1 through 3 to review her needs. For each of the following relations, specify candidate keys, functional dependencies, and multi-valued dependencies (if any). Justify these specifications unless they are obvious. Given your specifications about keys, etc., what normal form does each relation have? Transform each relation into two or more relations that are in domain/key normal form. Indicate the primary key of each table, candidate keys, and foreign keys, and specify any referential integrity constraints. In answering these questions, assume the following:
Customers can request multiple services, but only one service on a given day.
Samantha creates one invoice for all services performed for a customer on a given day, but customers sometimes make partial payments on a given invoice.
A given tree species can receive multiple types of service and is also susceptible to multiple diseases.
A customer owns only one property.
Customers do move, but they leave their trees behind. When they move, they may or may not change their phone numbers. Either way, Samantha wants to continue to track the customers.
Customers can have multiple services and multiple chip deliveries.

A.
CUSTOMER1 (Name, Phone, Street, City, State, Zip, AppointmentDate, ServiceRequested)
Candidate Keys: (Name, AppointmentDate)
Functional Dependencies:
(Name, AppointmentDate) → all other attributes
Name → Phone, Zip
Zip → Street, City, State
Multi-valued dependencies: None.
Normal form: 2NF but not 3NF because of the transitive dependencies:
Name → Phone, Zip
Zip → Street, City, State
Domain/Key Normal Form Relations:
APPOINTMENT (AppointmentDate, ServiceRequested, Name)
CUSTOMER (Name, Phone, Zip)
ZIPCODE (Zip, Street, City, State)
Candidate key: Phone in CUSTOMER
Referential integrity constraints:
Name in APPOINTMENT must exist in Name in CUSTOMER
Zip in CUSTOMER must exist in Zip in ZIPCODE

B.
CUSTOMER2 (Name, Phone, Street, City, State, Zip, AppointmentDate, ServiceRequested, InvoiceNumber, AmountBilled, DatePaid, AmountPaid)
Candidate Keys: (InvoiceNumber, DatePaid)
Functional Dependencies:
InvoiceNumber → AppointmentDate, ServiceRequested, AmountBilled
Name → Phone, Zip
Zip → Street, City, State
Multi-valued dependencies:
InvoiceNumber →→ (DatePaid, AmountPaid)
Normal form: 1NF but not 2NF because of the partial dependency:
InvoiceNumber → AppointmentDate, ServiceRequested, AmountBilled
Domain/Key Normal Form Relations:
INVOICE (InvoiceNumber, AppointmentDate, ServiceRequested, AmountBilled, Name)
CUSTOMER (Name, Phone, Zip)
ZIPCODE (Zip, Street, City, State)
PAYMENT (InvoiceNumber, DatePaid, AmountPaid)
Candidate key: Phone in CUSTOMER
Referential integrity constraints:
Name in INVOICE must exist in Name in CUSTOMER
Zip in CUSTOMER must exist in Zip in ZIPCODE
InvoiceNumber in PAYMENT must exist in InvoiceNumber in INVOICE

C.
TREE (Customer, Street, City, State, Zip, LocationOnProperty, Species, ApproxAge, ServiceDate, ServiceDescription)
Candidate Keys: (Customer, ServiceDate)
Functional Dependencies:
(Customer, ServiceDate) → all attributes
(Customer, LocationOnProperty) → Species, ApproxAge
Customer → Phone, Zip
Zip → Street, City, State
Multi-valued dependencies: None.
Normal form: 2NF but not 3NF because of the transitive dependency:
LocationOnProperty → Zip → Street, City, State
Domain/Key Normal Form Relations:
TREE (CustName, LocationOnProperty, Species, ApproxAge)
SERVICE (CustName, AppointmentDate, ServiceRequested)
CUSTOMER (CustName, Phone, Zip)
ZIPCODE (Zip, Street, City, State)
Candidate key: Phone in CUSTOMER
Referential integrity constraints:
CustName in TREE must exist in CustName in CUSTOMER
CustName in SERVICE must exist in CustName in CUSTOMER
Zip in CUSTOMER must exist in Zip in ZIPCODE

D.
SPECIES (SpeciesName, Disease, ServiceType)
Assumption: There is no relationship between Disease and ServiceType.
Candidate Keys: (SpeciesName, Disease, ServiceType)
Functional Dependencies: None.
Multi-valued dependencies:
SpeciesName →→ Disease
SpeciesName →→ ServiceType
Normal form: 3NF (it has no dependent attributes) but not 4NF, because it has two multi-valued dependencies.
Domain/Key Normal Form Relations:
SPECIESDISEASE (SpeciesName, Disease)
SPECIESSERVICE (SpeciesName, ServiceType)
No additional candidate keys.
No referential integrity constraints.

E.
RECURRING_SERVICE (CustomerName, Phone, ServiceDescription, ServiceInterval, LastServiceDate)
Candidate Keys: (CustomerName, ServiceDescription, LastServiceDate)
Functional Dependencies:
(CustomerName, ServiceDescription, LastServiceDate) → all attributes
CustomerName → Phone
ServiceDescription → ServiceInterval
Multi-valued dependencies: None.
Normal form: 2NF but not 3NF because of the transitive dependency:
(CustomerName, ServiceDescription, LastServiceDate) → CustomerName → Phone
Domain/Key Normal Form Relations:
RECURRINGSERVICE (CustomerName, ServiceDescription, LastServiceDate)
CUSTOMER (CustomerName, Phone)
SERVICE (ServiceDescription, ServiceInterval)
Candidate key: Phone in CUSTOMER
Referential integrity constraints:
ServiceDescription in RECURRINGSERVICE must exist in ServiceDescription in SERVICE
CustomerName in RECURRINGSERVICE must exist in CustomerName in CUSTOMER

F.
CHIP_DELIVERY_REQUEST (CustomerName, Phone, DateOfRequest, DateOfDelivery)
Candidate Keys:
(CustomerName, DateOfRequest)
(CustomerName, DateOfDelivery)
Functional Dependencies:
(CustomerName, DateOfRequest) → all attributes
CustomerName → Phone
Multi-valued dependencies: None.
Normal form: 1NF but not 2NF because of the partial dependency:
CustomerName → Phone
Domain/Key Normal Form Relations:
CHIP_DELIVERY_REQUEST (CustomerName, DateOfRequest, DateOfDelivery)
CUSTOMER (CustomerName, Phone)
Candidate key: Phone in CUSTOMER
Referential integrity constraint:
CustomerName in CHIP_DELIVERY_REQUEST must exist in CustomerName in CUSTOMER
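
The key and partial-dependency reasoning used for CHIP_DELIVERY_REQUEST in item F can be sketched with the standard attribute-closure computation. This Python example is an illustration under the functional dependencies stated in that answer; closure and partial_dependencies are helper names introduced here, not part of the original solution.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD whenever its left side is already in the closure.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(key, fds, all_attrs):
    """Map each proper subset of key to the non-key attributes it determines."""
    non_key = set(all_attrs) - set(key)
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            determined = closure(subset, fds) & non_key
            if determined:
                found[subset] = determined
    return found


ALL_ATTRS = {"CustomerName", "Phone", "DateOfRequest", "DateOfDelivery"}

# Functional dependencies as stated in the answer to item F.
FDS = [
    (("CustomerName", "DateOfRequest"), ("Phone", "DateOfDelivery")),
    (("CustomerName",), ("Phone",)),
]

KEY = ("CustomerName", "DateOfRequest")

# The key determines every attribute, so it qualifies as a candidate key.
is_key = closure(KEY, FDS) == ALL_ATTRS

# Phone depends on CustomerName alone: the partial dependency that breaks 2NF.
partials = partial_dependencies(KEY, FDS, ALL_ATTRS)

print(is_key, partials)
```

Moving Phone into CUSTOMER (CustomerName, Phone), as the answer does, leaves no attribute that depends on only part of the key.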