ITEC 3220A Using and Designing Database Systems Instructor: Prof. Z. Yang Course Website: http://people.yorku.ca/~zyang/itec 3220a.htm Office: Tel 3049 Supertypes and Subtypes • Generalization hierarchy: depicts relationships between higher-level supertype and lower-level subtype entities • Supertype: contains the shared attributes • Subtype: contains the unique attributes • Inheritance: – Subtype entities inherit values of all attributes of the supertype – An instance of a subtype is also an instance of the supertype 2 Supertypes and Subtypes (Cont’d) Supertype/ subtype relationships Attributes shared by all entities General entity type SUPERTYPE And so forth SUBTYPE1 Attributes unique to subtype1 SUBTYPE2 Specialized version of supertype Attributes unique to subtype2 3 Supertypes and Subtypes (Cont’d) • Disjoint relationships – Unique subtypes – Non-overlapping – Indicated with a ‘G’ • Overlapping subtypes – An instance of the supertype could be more than one of the subtypes – Indicated with a ‘Gs’ 4 Generalization Hierarchy with Overlapping Subtypes 5 Chapter 5 Logical Database Design and Normalization of Database Tables In this chapter, you will learn: • How to transform ERD into relations • What normalization is and what role it plays in database design • About the normal forms 1NF, 2NF, 3NF, BCNF, and 4NF • How normal forms can be transformed from lower normal forms to higher normal forms • Normalization and E-R modeling are used concurrently to produce a good database design • Some situations require denormalization to generate information efficiently 7 Transforming ERD into Relations • Step one: Map regular entities – Each regular entity type in an ER diagram is transformed into a relation. – The name given to the relation is generally the same as the entity type. – Each simple attribute of the entity type becomes an attribute of the relation and the identifier of entity becomes the primary key of the corresponding relation. 8 Example STUDENT Student_ID Student_Name Other_Attributes 9 Transforming ERD into Relations (Cont’d) • Step two: Map weak entities – Create a new relation and include all of the simple attributes as the attributes of this relation. Then include the primary key of the identifying relation as a foreign key attribute in this new relation. The primary key of the new relation is the combination of this primary key of the identifying relation and the partial identifier of the weak entity type. 10 Example Employee_ID Date_of_birth Employee_Name Dependent_Name EMPLOYEE Has DEPENDENT Gender EMPLOYEE Employee_ID Employee_Name DEPENDENT Dependent_Name Employee_ID Date_of_birth Gender 11 Transforming ERD into Relations (Cont’d) • Step three: Map binary relationship – Map Binary one-to-many relations •First create a relation for each of the two entity types participating in the relationship, using the procedure described in step one. •Next, include the primary key attribute of the entity on the one-side of the relationship as a foreign key in the relation that is on the many-side of the relationship. 12 Example Customer_ID Customer_Name Order_ID 1 Order_Date M Customer Submits (0,N) Order (1,1) Customer Customer_ID Customer_Name Order Order_ID Order_Date Customer_ID 13 Transforming ERD into Relations (Cont’d) • Step three: Map binary relationship (Cont’d) – Map binary one-to-one relationships •First, two relations are created one for each of the participating entity types. •Second, the primary key of one of the relations is included as a foreign key in the other relation. 14 Example Nurse_ID Nurse_Name Centre_Name 1 1 In_charge Nurse Location (0,1) (1,1) Nurse Nurse_ID Nurse_Name Care Centre Centre_Name Location Care Centre Nurse_in_c harge 15 Transforming ERD into Relations (Cont’d) • Step Four: Map composite Entities – First step • Create three relations: one for each of the two participating entities, and the third for the composite entity. We refer to the relation formed from the composite entity as the composite relation – Second step • Identifier not assigned: The default primary key for the composite relation consists of the two primary key attributes from the other two relations. • Identifier assigned: The primary key for the composite relation is the identifier. The primary keys for the two participating entity types are included as foreign keys in the composite relation. 16 Example Order_ID Order Order_Date 1 (1,N) (1,1) M Order Line Product_ID M (1,1) 1 (0,N) Product Description Quantity Price 17 Example Order Order_ID Order_Date Order Line Product_ID Order_ID Quantity Description Standard_Price Product Product_ID 18 Example Customer_ID Vendor_ID Name Customer 1 Date M Shipment Shipment_No Customer Customer_ID Vendor_ID M 1 Vendor Amount Customer_Name Shipment Shipment_N Vendor_I o D Vendor Address Address Customer_I Date D Amount 19 Transforming ERD into Relations (Cont’d) • Step Five: Map unary relationship – Map unary one-to-many relationship •The entity type in the unary relationship is mapped to a relation using the procedure described in Step one. Then a foreign key attribute is added within the same relation that references the primary key values. A recursive foreign key is a foreign key in a relation that references the primary key values of that same relation. 20 Example Name Employee_ID Employee Birthdate M (1,1) 1 (0,N) Employee Employee_ID Manages Name Birthdate Manager_ID 21 Transforming ERD into Relations (Cont’d) • Step six: Map ternary relationship – Convert a ternary relationship to a composite entity – To map a composite entity that links three regular entities, we create a new composite relation. The default primary key of their relation consists of the three primary key attributes for the participating entities. Any attributes of the composite entity become attributes of the new relation 22 Example Patient_ID Patient_Name Patient 1 M Results (0,N) (1,1) Physician_ID (0,N) (1,1) M Physician_Name Physician 1 Date Patient Treatment Time Treatment_ Code M (1,1) 1 (0,N) Treatment Description 23 Example Patient Patient_ID Patient_Name Physician Physician_ID Physician_Name Patient Treatment Patient_ Physician_I Treatment_C Date ID D ode Treatment Treatment_Code Time Result Description 24 Transforming ERD into Relations (Cont’d) • Step seven: Map supertype/subtype relationships 1. Create a separate relation for the supertype and for each of its subtypes 2. Assign to the relation created for the supertype the attributes that are common to all members of the supertype, including the primary key 3. Assign to the relation for each subtype the primary key of the supertype, and only those attributes that are unique to that subtype 4. Assign one attribute of the supertype to function as the subtype discriminator 25 Example Address Employee_Name Employee_Number Employee_Type Employee Date_Hired Gs Hourly Employee Hourly_Rate Salaried Employee Annual_Salary Stock_Option 26 Example Employee Employee_Nu Employee_N Address mber ame Hourly_Employee H_Employee_Number Employee_Ty Date_Hi pe red Hourly_Rate Salaried_Employee S_Employee_Number Annual_Salary Stock_Option 27 Database Tables and Normalization • Table is the basic building block in database design • Normalization is the process for assigning attributes to entities – Reduces data redundancies – Helps eliminate data anomalies – Produces controlled redundancies to link tables • Normalization stages – – – – 1NF - First normal form 2NF - Second normal form 3NF - Third normal form 4NF - Fourth normal form 28 Need for Normalization 29 Anomalies In the Table • PRO_NUM intended to be primary key • Table displays data anomalies – Update • Modifying JOB_CLASS – Insertion • New employee must be assigned project – Deletion • If employee deleted, other vital data lost 30 Conversion to First Normal Form • Repeating group – Derives its name from the fact that a group of multiple entries of same type can exist for any single key attribute occurrence • Relational table must not contain repeating groups • Normalizing table structure will reduce data redundancies • Normalization is three-step procedure 31 Conversion to First Normal Form (continued) • Step 1: Eliminate the Repeating Groups – Present data in tabular format, where each cell has single value and there are no repeating groups – Eliminate repeating groups, eliminate nulls by making sure that each repeating group attribute contains an appropriate data value 32 Conversion to First Normal Form (continued) 33 Conversion to First Normal Form (continued) • Step 2: Identify the Primary Key – Primary key must uniquely identify attribute value – New key must be composed 34 Conversion to First Normal Form (continued) • Step 3: Identify all dependencies – Dependencies can be depicted with help of a diagram – Dependency diagram: •Depicts all dependencies found within given table structure •Helpful in getting bird’s-eye view of all relationships among table’s attributes •Makes it less likely that will overlook an important dependency 35 Conversion to First Normal Form (continued) 36 Conversion to First Normal Form (continued) • First normal form describes tabular format in which: – All key attributes are defined – There are no repeating groups in the table – All attributes are dependent on primary key • All relational tables satisfy 1NF requirements • Some tables contain partial dependencies – Dependencies based on only part of the primary key – Sometimes used for performance reasons, but should be used with caution – Still subject to data redundancies 37 Conversion to Second Normal Form • Relational database design can be improved by converting the database into second normal form (2NF) • Two steps 38 Conversion to Second Normal Form (continued) • Step 1: Write Each Key Component on a Separate Line – Write each key component on separate line, then write original (composite) key on last line – Each component will become key in new table 39 Conversion to Second Normal Form (continued) • Step 2: Assign Corresponding Dependent Attributes – Determine those attributes that are dependent on other attributes – At this point, most anomalies have been eliminated 40 Conversion to Second Normal Form (continued) 41 Conversion to Second Normal Form (continued) • Table is in second normal form (2NF) when: – It is in 1NF and – It includes no partial dependencies: •No attribute is dependent on only portion of primary key 42 Conversion to Third Normal Form • Data anomalies created are easily eliminated by completing three steps • Step 1: Identify Each New Determinant – For every transitive dependency, write its determinant as PK for new table •Determinant – Any attribute whose value determines other values within a row 43 Conversion to Third Normal Form (continued) • Step 2: Identify the Dependent Attributes – Identify attributes dependent on each determinant identified in Step 1 and identify dependency – Name table to reflect its contents and function 44 Conversion to Third Normal Form (continued) • Step 3: Remove the Dependent Attributes from Transitive Dependencies – Eliminate all dependent attributes in transitive relationship(s) from each of the tables that have such a transitive relationship – Draw new dependency diagram to show all tables defined in Steps 1–3 – Check new tables as well as tables modified in Step 3 to make sure that each table has determinant and that no table contains inappropriate dependencies 45 Conversion to Third Normal Form (continued) 46 Conversion to Third Normal Form (continued) • A table is in third normal form (3NF) when both of the following are true: – It is in 2NF – It contains no transitive dependencies 47 The Boyce-Codd Normal Form (BCNF) • Every determinant in table is a candidate key – Has same characteristics as primary key, but for some reason, not chosen to be primary key • When table contains only one candidate key, the 3NF and the BCNF are equivalent • BCNF can be violated only when table contains more than one candidate key 48 The Boyce-Codd Normal Form (BCNF) (continued) • Most designers consider the BCNF as special case of 3NF • Table is in 3NF when it is in 2NF and there are no transitive dependencies • Table can be in 3NF and fails to meet BCNF – No partial dependencies, nor does it contain transitive dependencies – A nonkey attribute is the determinant of a key attribute 49 The Boyce-Codd Normal Form (BCNF) (continued) 50 The Boyce-Codd Normal Form (BCNF) (continued) 51 An Example GRADE( Student_ID, Student_Name, Address, Major, Course_ID, Course_Title, Instructor_Name, Instructor_Office, Grade) 52 Normalization and Database Design • Normalization should be part of the design process • E-R Diagram provides macro view • Normalization provides micro view of entities – Focuses on characteristics of specific entities – May yield additional entities • Difficult to separate normalization from E-R diagramming • Business rules must be determined 53 Higher-Level Normal Forms • Fourth Normal Form (4NF) – Table is in 3NF – Has no multiple sets of multivalued dependencies 54 Conversion to 4NF Stud-ID Course Service Stud-ID Course 1126 1212F Red Cross 1126 1212F 1126 1620F United Way 1126 1620F 1126 1320F 1126 1320F Stud-ID Service 1126 Red Cross 1126 United Way Stud-ID Course 1126 1212F 1126 1620F 1126 1320F Service 1126 Red Cross 1126 United Way Stud-ID Course Service 1126 1212F Red Cross 1126 1620F United Way 1126 1320F Multivalued Dependencies Set of Tables in 4NF 55 Denormalization • Normalization is one of many database design goals • Normalized table requires – Additional processing – Loss of system speed • Normalization purity is difficult to sustain due to conflict in: – Design efficiency – Information requirements – Processing 56 Exercise Part Supplier Data Part_ Description No 1234 Logic Chips 5678 Memory Chips Vendor_Name Fast Chips Smart Chips Unit_ Cost Cupertino 10.00 Phoenix 8.00 Fast Chips Quality Chips Cupertino Austin 3.00 2.00 Phoenix 5.00 Smart Chips Address 57 Exercise(Cont’d) • Convert the table to a relation in first normal form (Named Part Supplier) • List the functional dependency in the Part Supplier and identify a candidate key • For the relation Part Supplier, identify the followings: an insert anomaly, a delete anomaly, and a modification anomaly. • Draw a relation schema and show the functional dependencies • Develop a set of 3NF relations from Part Supplier 58