Chapter 4 Logical Database Design and the Relational Model Jason C. H. Chen, Ph.D. Professor of MIS School of Business Administration Gonzaga University Spokane, WA 99258 chen@jepson.gonzaga.edu Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-1 Objectives • • • • • • • Define terms List five properties of relations State two properties of candidate keys Define first, second, and third normal form Describe problems from merging relations Transform E-R and EER diagrams to relations Create tables with entity and relational integrity constraints • Use normalization to convert anomalous tables to well-structured relations Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-2 Problem Solving for Modeling a Database Project Study and Analyze w/Team Business Problem User interview & Integrated Model ER or EER or OO ????? ????? ????? Normalization (3NF) Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-3 Relation • Definition: A relation is a ______, ______dimensional table of data – Table is made up of rows (records), and columns (attribute or field) • Not all tables qualify as relations • Requirements: – Every relation has a unique name. – Every attribute value is atomic (not multivalued, not composite) – Every row is unique (can’t have two rows with exactly the same values for all their fields) – Attributes (columns) in tables have unique names – The order of the columns is irrelevant – The order of the rows is irrelevant NOTE: all relations are in 1st Normal form Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-4 Correspondence with ER Model • Relations (tables) correspond with entity types and with many-to-many relationship types • Rows correspond with entity instances and with manyto-many relationship instances • Columns correspond with attributes • NOTE: The word relation (in relational database) is NOT the same as the word relationship (in ER model) Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-5 Key Fields • Keys are special fields that serve two main purposes: – _______ keys are unique identifiers of the relation in question. Examples include employee numbers, social security numbers, etc. This is how we can guarantee that all rows are unique – _______ keys are identifiers that enable a dependent relation (on the many side of a relationship) to refer to its parent relation (on the one side of the relationship) • Keys can be simple (a single field) or composite (more than one field) • Surrogate key • Keys usually are used as indexes to speed up the response to user queries (More on this in Ch. 5) Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-6 Sample E-R Diagram (Figure 2-1) Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-7 Figure 4-3 Schema for four relations (Pine Valley Furniture Company) Primary Key Foreign Key (implements 1:N relationship between customer and order) Combined, these are a composite primary key (uniquely identifies the order line)…individually they are foreign keys (implement M:N relationship between order and product) Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-8 Fig. 4-3: Schema for four relations (Pine Valley Furniture) Graphical and Text Representations CUSTOMER(Customer_ID, Customer_name,Address, City,State,Zip) ORDER(Order_ID, Order_Date,Customer_ID) ORDER_LINE(Order_ID, Product_ID,Quantity) PRODUCT(Product_ID, Product_Description, Product_Finish,Unit_Price, On_Hand) Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-9 Fig. 4-1: EMPLOYEE1 Relation with sample data EMPLOYEE1 EmpID Name DeptName Salary 100 140 110 190 150 Margaret Simpson Allen Beeton Chris Lucero Lorenzo Davis Susan Martin Marketing Accounting Info. System Finance Marketing 48,000 52,000 43,000 55,000 42,000 Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-10 Fig. 4-2: Eliminating multi-valued attributes (a) Table with repeating groups or multi-valued attributes (Un-Normalized) EMPLOYEE EmpID Name DeptName 100 Margaret Simpson Marketing 140 110 Allen Beeton Chris Lucero Accounting Info. System 190 150 Lorenzo Davis Susan Martin Finance Marketing Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems Salary Course Title Date Completed 48,000 SPSS 6/19/200X Surveys 10/7/200X 52,000 Tax Acc 12/8/200X 43,000 SPSS 1/12/200X C++ 4/22/200X 55,000 42,000 SPSS 6/16/200X Java 8/12/200X TM 4-11 Figure 4-2 (a) Table with repeating groups – how to “remove” them (and solve the problem) EmpID Name DeptName 100 Margaret Simpson Marketing 140 110 Allen Beeton Chris Lucero Accounting Info. System 190 150 Lorenzo Davis Susan Martin Finance Marketing Salary Course Title 48,000 SPSS Surveys 52,000 Tax Acc 43,000 SPSS C++ 55,000 42,000 SPSS Java Date Completed 6/19/200X 10/7/200X 12/8/200X 1/12/200X 4/22/200X 6/16/200X 8/12/200X Figure 4-2 (b) EMPLOYEE2 relation EmpID Name DeptName 100 Margaret Simpson Marketing 100 Margaret Simpson Marketing 140 Allen Beeton Accounting 110 Chris Lucero Info. System 110 Chris Lucero Info. System 190 Lorenzo Davis Finance 150 Susan Martin Marketing 150 Martin Marketing Copyright © Addison Susan Wesley Longman, Inc. & Dr. Chen, Business Database Systems Salary Course Date Title Completed 48,000 SPSS 6/19/200X 48,000 Surveys 10/7/200X 52,000 Tax Acc 12/8/200X 43,000 SPSS 1/12/200X 43,000 C++ 4/22/200X 55,000 42,000 SPSS 6/16/200X 42,000 Java 8/12/200XTM 4-12 Fig. 4-2: Eliminating multi-valued attributes (b) EMPLOYEE2 Relation (Normalized) EMPLOYEE2 EmpID Name DeptName 100 100 140 110 110 190 150 150 Margaret Simpson Marketing Margaret Simpson Marketing Allen Beeton Accounting Chris Lucero Info. System Chris Lucero Info. System Lorenzo Davis Finance Susan Martin Marketing Susan Martin Marketing Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems Salary Course Title Date Completed 48,000 48,000 52,000 43,000 43,000 55,000 42,000 42,000 6/16/200X 8/12/200X SPSS 6/19/200X Surveys 10/7/200X Tax Acc 12/8/200X SPSS 1/12/200X C++ 4/22/200X SPSS Java TM 4-13 Constraints • Domain Constraints – Allowable values for an attribute. – A domain definition contains: domain name, data type, size, meaning, and allowable values/range (if applicable). • Entity Integrity – No primary key attribute may be null. • Referential Integrity – A relationship between primary key and foreign key. • Operational Constraints – Business rules (see Chapter 3) Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-14 Integrity Constraints • _________ Integrity – rule that states that any foreign key value (on the relation of the many side) MUST match a primary key value in the relation of the one side. (Or the foreign key can be null) – For example: Delete Rules • Restrict – don’t allow delete of “parent” side if related rows exist in “dependent” side • Cascade – automatically delete “dependent” side rows that correspond with the “parent” side row to be deleted e.g., DROP TABLE tablename CASCADE CONSTRAINTS; (p.110 of Oracle 11g) • Set-to-Null – set the foreign key in the dependent side to null if deleting from the parent side not allowed for weak entities Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-15 Fig. 4-5: Referential integrity constraints (Pine Valley Furniture) pk fk pk cpk/pk fk fk Referential integrity constraints are drawn via arrows from dependent to parent table pk Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-16 Figure 4-6 SQL table definitions Referential integrity constraints are implemented with foreign key to primary key references Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-17 Referential Integrity (Addition and Deletion) CUSTOMER Customer_ID Customer_Name Customer_Address 123 345 489 789 … John Smith Allen Stone Mary Jones Kent Watson S. 34 Freya, Spokane 45 A St. Cheney E. 21 Trent, Spokane 23 Y Ave. Spokane PK ORDER FK Order_ID Order_Date Customer_ID OR-002 OR-003 OR-004 OR-004 … 9/21/1998 9/30/1998 10/1/1998 10/2/1998 123 345 123 489 642 Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-18 STUDENT pk SID SLNAME 100 Miller … 105 Connoly … 9188 FACULTY pk FID 1 … 5 fk FID 1 … 3 fk FLNAME Cox … Brown LOCATION pk LOCID BLDG_CODE 45 CR … 57 LIB pk (FACULTY) SPIN 8891 LOCID 53 … 57 ROOM 101 222 … FPIN 1181 … 9899 CAPACITY 150 … 1 fk (STUDENT) RULES: 1. You can’t add a record to the table with fk, (e.g., STUDENT) unless there is a corresponding record in the table with pk. 2. You can’t delete a record in the table with pk (e.g., FACULTY) if there is a record in the table with fk. Order of entering data into the database: LOCATION FACULTY STUDENT Order of deleting data from the database:STUDENT FACULTY LOCATION Well-Structured Relations • A well-structured relation contains minimal redundancy and allows users to insert, modify, and delete the rows in a table without errors or inconsistencies. • The following anomalies should be removed for a well-structured relation: – Insertion Anomaly – Deletion Anomaly – Modification Anomaly Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-20 Is EMPLOYEE2 a WellStructured relation? EMPLOYEE2 EmpID Name DeptName 100 100 140 110 110 190 150 150 Margaret Simpson Marketing Margaret Simpson Marketing Allen Beeton Accounting Chris Lucero Info. System Chris Lucero Info. System Lorenzo Davis Finance Susan Martin Marketing Susan Martin Marketing Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems Salary Course Title 48,000 48,000 52,000 43,000 43,000 55,000 42,000 42,000 Date Completed SPSS 6/19/200X Surveys 10/7/200X Tax Acc 12/8/200X SPSS 1/12/200X C++ 4/22/200X SPSS Java 6/16/200X 8/12/200X TM 4-21 EMPLOYEE2 EmpID Name DeptName 100 100 140 110 110 190 150 150 Margaret Simpson Marketing Margaret Simpson Marketing Allen Beeton Accounting Chris Lucero Info. System Chris Lucero Info. System Lorenzo Davis Finance Susan Martin Marketing Susan Martin Marketing Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems Salary Course Title Date Completed 48,000 48,000 52,000 43,000 43,000 55,000 42,000 42,000 SPSS Surveys Tax Acc SPSS C++ 6/19/200X 10/7/200X 12/8/200X 1/12/200X 4/22/200X SPSS Java 6/16/200X 8/12/200X TM 4-22 Is EMPLOYEE2 a WellStructured relation? No/Yes WHY? Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-23 __________Anomaly: Inserting a new row, the user must supply values for both EmpID (PK) and CourseTitle (CPK and FK). This is an (insertion) anomaly, since the user should be able to enter employee data without knowing (supplying) course (title) data. EMPLOYEE2 EmpID Name DeptName 100 100 140 110 110 190 150 150 Margaret Simpson Marketing Margaret Simpson Marketing Allen Beeton Accounting Chris Lucero Info. System Chris Lucero Info. System Lorenzo Davis Finance Susan Martin Marketing Susan Martin Marketing Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems Salary Course Date Title Completed 48,000 SPSS 6/19/200X 48,000 Surveys 10/7/200X 52,000 Tax Acc 12/8/200X 43,000 SPSS 1/12/200X 43,000 C++ 4/22/200X 55,000 42,000 SPSS 6/16/200X 42,000 Java 8/12/200X TM 4-24 Deletion Anomaly: Deleting the employee number 140, it results in losing not only the employee’s information but also the course had an offering that completed on that date. EMPLOYEE2 EmpID Name DeptName 100 100 140 110 110 190 150 150 Margaret Simpson Marketing Margaret Simpson Marketing Allen Beeton Accounting Chris Lucero Info. System Chris Lucero Info. System Lorenzo Davis Finance Susan Martin Marketing Susan Martin Marketing Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems Salary Course Date Title Completed 48,000 SPSS 6/19/200X 48,000 Surveys 10/7/200X 52,000 Tax Acc 12/8/200X 43,000 SPSS 1/12/200X 43,000 C++ 4/22/200X 55,000 42,000 SPSS 6/16/200X 42,000 Java 8/12/200X TM 4-25 ________ Anomaly: If the employee number 100 gets a salary increase, we must record the increase in each of the rows for that employee (two occurences); otherwise the data will be inconsistent. EMPLOYEE2 EmpID Name 100 100 140 110 110 190 150 150 DeptName Margaret Simpson Marketing Margaret Simpson Marketing Allen Beeton Accounting Chris Lucero Info. System Chris Lucero Info. System Lorenzo Davis Finance Susan Martin Marketing Susan Martin Marketing Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems Salary Course Date Title Completed 48,000 SPSS 6/19/200X 48,000 Surveys 10/7/200X 52,000 Tax Acc 12/8/200X 43,000 SPSS 1/12/200X 43,000 C++ 4/22/200X 55,000 42,000 SPSS 6/16/200X 42,000 Java 8/12/200X TM 4-26 Fig. 4-7: EMP_COURSE: Normalized Relations from EMPLOYEE2 EMPLOYEE2 EmpID Name DeptName 100 100 140 110 110 190 150 150 Margaret Simpson Margaret Simpson Allen Beeton Chris Lucero Chris Lucero Lorenzo Davis Susan Martin Susan Martin Marketing Marketing Accounting Info. System Info. System Finance Marketing Marketing Salary Course Title 48,000 48,000 52,000 43,000 43,000 55,000 42,000 42,000 SPSS 6/19/201X Surveys 10/7/201X Tax Acc 12/8/201X SPSS 1/12/201X C++ 4/22/201X SPSS Java EMPLOYEE1 EmpID 100 140 110 190 150 Name Margaret Simpson Allen Beet Chris Lucero Lorenzo Davis Sususan Martin Date Completed 6/16/201X 8/12/201X EMP_COURSE DeptName Salary Marketing Accounting Info. System Finance Marketing 48,000 52,000 43,000 55,000 42,000 Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems Is there any anomaly ? EmpID 100 100 140 110 110 150 150 ?? Course Date Title Completed SPSS 6/19/201X Surveys 10/7/201X Tax Acc 12/8/201X SPSS 1/12/201X C++ 4/22/201X SPSS 6/19/201X Java 8/12/201X TM 4-27 Problem Solving for Modeling a Database Project Study and Analyze w/Team Business Problem User interview & Integrated Model ER or EER or OO Relations Transformation (Seven Cases) Transformation to Relations ????? Normalization (3NF) Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-28 Seven Cases of Transforming EE-R Diagrams into Relations 1. Map Regular Entities 2. Map Weak Entities 3. Map Binary Relationships 4. Map Associative Entities 5. Map Unary Relationships 6. Map Ternary (and n-ary) Relationships 7. Map Supertype/Subtype Relationships Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-29 Transforming EE-R Diagrams into Relations 1. Map Regular Entities to Relations E-R attributes map directly onto the relation (Fig. 4-8) Composite attributes: Use only their simple, component attributes (Fig. 4-9). Multi-valued Attribute : Becomes a separate relation with a foreign key taken from the superior entity (Fig. 4-10). Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-30 Fig. 4-8: Mapping the regular entity CUSTOMER (a) CUSTOMER entity type (b) CUSTOMER relation [Same name on relation and entity type] Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-31 Figure 4-9 Mapping a composite attribute (a) CUSTOMER entity type with composite attribute (b) CUSTOMER relation with address detail [Use only their simple, component attributes] Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-32 Fig. 4-10: Mapping an entity with a multivalued attribute (a) Employee entity type with multivalued attribute Multivalued attribute becomes a separate relation with foreign key (b) Mapping a multivalued attribute EMPLOYEE EmployeeID EmployeeID EmployeeName EmployeeAddress Skill One–to–many relationship between original entity and new relation Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems [Two relations created with one containing all of the attributes except the multivalued attribute, and the second one contains the pk (on the first one) and the multivalued attribute] TM 4-33 Break ! (Ch. 4 - Part I) In class exercise #1-I (a) (p.193), apply Figure 2-8. HW # 1-I (c), apply Figure 2-11.a Fig. 3-8: Figure 2-8 Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-34 Problem Solving for Modeling a Database Project Study and Analyze w/Team Business Problem User interview & Integrated Model ER or EER or OO Relations Transformation (Seven Cases) Transformation to Relations ????? Normalization (3NF) Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-35 Seven Cases of Transforming EE-R Diagrams into Relations 1. Map Regular Entities 2. Map Weak Entities 3. Map Binary Relationships 4. Map Associative Entities 5. Map Unary Relationships 6. Map Ternary (and n-ary) Relationships 7. Map Supertype/Subtype Relationships Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-36 Transforming EER Diagrams into Relations 2. Mapping Weak Entities – Becomes a separate relation with a foreign key taken from the superior entity – Primary key composed of: • Partial identifier of weak entity • Primary key of identifying relation (strong entity) (Fig. 4-11) Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-37 Fig. 4-11: Example of mapping a weak entity (a) Weak entity DEPENDENT Fig. 4-11: (b) Relations resulting from weak entity [Becomes a separate relation with a foreign key taken from the superior entity] PK CPK NOTE: the domain constraint for the foreign key should NOT allow null value if DEPENDENT is a weak entity FK Question: Why need all FOUR attributes to be a CK? Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-38 Transforming EE-R Diagrams Into Relations 3. Map Binary Relationships – One-to-Many - Primary key on the one side becomes a foreign key on the many side (Fig. 4-12). – Many-to-Many - Create a new relation with the primary keys of the two entities as its primary key (Fig. 4-13). – One-to-One - Primary key on the mandatory side becomes a foreign key on the optional side (Fig. 4-14). Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-39 Fig. 4-12: Example of mapping a 1:M relationship (a) Relationship between customers and orders Note the mandatory one Fig. 4-12: (b) Mapping the relationship [Primary key on the one side becomes a foreign key on the many side] Foreign key Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems Again, no null value in the foreign key…this is because of the mandatory minimum cardinality TM 4-40 Figure 4-13 Example of mapping an M:N relationship a) Completes relationship (M:N) The Completes relationship will need to become a separate relation Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-41 Figure 4-13 Example of mapping an M:N relationship (cont.) b) Three resulting relations Composite primary key (cpk) Foreign key Foreign key Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems New intersection relation TM 4-42 Figure 4-14 Example of mapping a binary 1:1 relationship a) In_charge relationship (1:1) mandatory optional Often in 1:1 relationships, one direction is optional. Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-43 Fig. 4-14: (b) Resulting relations mandatory Same domain as Nurse_ID PK optional FK [Primary key on the mandatory side becomes a foreign key on the optional side] Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-44 Transforming EE-R Diagrams Into Relations 4. Map Associative Entities – Identifier Not Assigned • Default primary key for the association relation is composed of the primary keys of the two entities (as in M:N relationship) (Fig. 4-15) – Identifier Assigned • It is natural and familiar to end-users. • Default identifier may not be unique. (Fig. 4-16). Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-45 Figure 4-15 Example of mapping an associative entity B A a) An associative entity A B A Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-46 Figure 4-15 Example of mapping an associative entity (cont.) b) Three resulting relations PK fk [Default primary key for the association relation is the primary keys of the two entities] cpk fk PK Composite primary key formed from the two foreign keys Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-47 Figure 4-16: Mapping an associative entity (a) Associative entity (SHIPMENT) [Default primary key for the association relation is assigned] (b) Three resulting relations Primary key differs from foreign keys Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-48 Transforming EE-R Diagrams Into Relations 5. Map Unary Relationships – One-to-Many • Recursive foreign key in the same relation (Fig. 4-17). • A recursive FK is a FK in a relation that references the PK values of that same relation. It must have the same domain as the PK. – Many-to-Many - Bill-of-materials: Two relations: • One for the entity type. • One for an associative relation in which the primary key has two attributes, both taken from the primary key of the entity. (Fig. 4-18). Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-49 Figure 4-17 Mapping a unary 1:N relationship (a) EMPLOYEE entity with unary relationship A recursive FK is a FK in a relation that references the PK values of that same relation. It must have the same domain as the PK. (b) EMPLOYEE relation with recursive foreign key EMPLOYEE EmployeeID EmployeeName EmployeeDateOfBirth Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems ManagerID TM 4-50 Figure 4-18: Mapping a unary M:N relationship One for the entity type. One for an associative relation in which the primary key has two attributes, both taken from the primary key of the entity. (a) Bill-ofmaterials relationships (M:N) (b) ITEM and COMPONEN T relations ITEM PK ItemNo fk ItemDescription COMPONENT ItemNo Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems cpk ItemUnitCost fk ComponentNo Quantity TM 4-51 Transforming EE-R Diagrams Into Relations 6. Map Ternary (and n-ary) Relationships – One relation for each entity and one for the associative entity. – Associative entity has foreign keys to each entity in the relationship – (Fig. 4-19). Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-52 Figure 4-19 Mapping a ternary relationship a) PATIENT TREATMENT Ternary relationship with associative entity Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-53 Figure 4-19 Mapping a ternary relationship (cont.) b) Mapping the ternary relationship PATIENT TREATMENT A patient may receive a treatment one in the morning, then the same treatment in the afternoon. Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-54 Figure 4-19 Mapping a ternary relationship (cont.) b) Mapping the ternary relationship PATIENT TREATMENT Remember that This is why the primary treatment date and key MUST be time are included in unique the composite primary key But this makes a very cumbersome key… Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems It would be better to create a surrogate key like Treatment# How to create it with Oracle? TM 4-55 Transforming EER Diagrams into Relations 7. Mapping Supertype/Subtype Relationships – One relation for supertype and for each subtype – Supertype attributes (including identifier and subtype discriminator) go into supertype relation – Subtype attributes go into each subtype; primary key of supertype relation also becomes primary key of subtype relation – 1:1 relationship established between supertype and each subtype, with supertype as primary table (Fig. 420). Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-56 Figure 4-20 Supertype/subtype relationships Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-57 Figure 4-21 Mapping Supertype/subtype relationships to relations These are implemented as oneto-one relationships Display a table that contains all the attributes for SALARIED_EMPLOYEE Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems SELECT * FROM EMPLOYEE, SALARIED_EMPLOYEE WHRE Employee_Number= S_Employee_Number; TM 4-58 Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-59 Break ! (Ch. 4 - Part II) Figure 3-6(b) In class exercise Transform it to relations (NOT 3NF) #2-III-a , (p.193, apply Figure 3-6.b (read step 7, on p.176) HW #2-III-d, (p.193), apply Figure 3-10 Partial Specialization Fig. 3-8: Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-60 Break ! (Ch. 4 - Part II) In class exercise (another set) (p.193) Transform it to relations (NOT 3NF) #2-III (b) (Figure 3-7a) (see next slide) HW #2-III (c) (Figure 3-9) Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-61 Fig. 3-7: Examples of disjointness constraints (a) Disjoint rule Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-62 Fig. 3-9: Subtype discriminator (overlap rule) Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-63 Problem Solving for Modeling a Database Project Study and Analyze w/Team Business Problem User interview & Integrated Model ER or EER or OO Relations Transformation (Seven Cases) Transformation to Relations Normalization Normalization (3NF) IMPLEMENTATION Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-64 Seven Cases of Transforming EE-R Diagrams into Relations 1. Map Regular Entities 2. Map Weak Entities 3. Map Binary Relationships 4. Map Associative Entities 5. Map Unary Relationships 6. Map Ternary (and n-ary) Relationships 7. Map Supertype/Subtype Relationships Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-65 Next Topic • Next topic is the most important topic (theory) in this database management class. • What is it? • Normalization Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-66 Data Normalization • The process of decomposing relations with anomalies to produce smaller, wellstructured and stable relations • Primarily a tool to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of data Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-67 Well-Structured Relations • A relation that contains minimal data redundancy and allows users to insert, delete, and update rows without causing data inconsistencies • Goal is to avoid (minimize) anomalies – Insertion Anomaly – adding new rows forces user to create duplicate data – Deletion Anomaly – deleting rows may cause a loss of data that would be needed for other future rows – Modification Anomaly – changing data in a row forces changes to other rows because of duplication General rule of thumb: a table should not pertain to more than one entity type Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-68 Example – Figure 4.2b Question – Is this a relation? Answer – Yes: unique rows and no multivalued attributes Question – What’s the primary key? Answer – EmpID and CourseTitle a composite key Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-69 Anomalies in this Table • Insertion – can’t enter a new employee without having the employee take a class • Deletion – if we remove employee 140, we lose information about the existence of a Tax Acc class • Modification – giving a salary increase to employee 100 forces us to update multiple records Why do these anomalies exist? Because are two themes (entity types – what are they?) in this one relation (two themes, entity types, were combined). This results in duplication, and an unnecessary dependency between the entities Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-70 Functional Dependencies and Keys • Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute. • Candidate Key – A unique identifier. One of the candidate keys will become the primary key • E.g. perhaps there is both credit card number and SS# in a table…in this case both are candidate keys – Each non-key field is functionally dependent on every candidate key Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-71 Figure 4-23: Representing Functional Dependencies (cont.) (a) Functional dependencies in EMPLOYEE1 Fig. 4-2a) (b) Functional dependencies in EMPLOYEE2 (Fig. 4-2b) Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-72 First Normal Form • No multivalued attributes • Every attribute value is atomic (singledvalue) • Fig. 4-2a is not in 1st Normal Form (multivalued attributes) it is not a relation • Fig. 4-2b is in 1st Normal form (but not in a well-structured relation) • All relations are in 1st Normal Form • The following example is not from the text will be illustrated for Normalization process. Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-73 Figure 4-2 (a) EmpID Name DeptName Salary 100 Margaret Simpson Marketing 48,000 140 110 Allen Beeton Chris Lucero Accounting Info. System 52,000 43,000 190 150 Lorenzo Davis Susan Martin Finance Marketing 55,000 42,000 Course Date Title Completed SPSS 6/19/200X Surveys 10/7/200X Tax Acc 12/8/200X SPSS 1/12/200X C++ 4/22/200X SPSS Java 6/16/200X 8/12/200X Figure 4-2 (b) EmpID Name 100 100 140 110 110 190 150 150 DeptName Margaret Simpson Marketing Margaret Simpson Marketing Allen Beeton Accounting Chris Lucero Info. System Chris Lucero Info. System Lorenzo Davis Finance Susan Martin Marketing Susan Martin Marketing Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems Salary Course Title 48,000 SPSS 48,000 Surveys 52,000 Tax Acc 43,000 SPSS 43,000 C++ 55,000 42,000 SPSS 42,000 Java Date Completed 6/19/200X 10/7/200X 12/8/200X 1/12/200X 4/22/200X 6/16/200X 8/12/200X TM 4-74 Second Normal Form • 1NF and every non-key attribute is fully functionally dependent on the primary key. • Every non-key attribute must be defined by the entire key (either a single PK or a CPK), not by only part of the key. • No partial functional dependencies. • Fig. 4-2b is NOT in 2nd Normal Form Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-75 Figure: 4-22 Steps in normalization Table with Multivalued attributes Remove Multivalued Attributes First normal form (1NF) Second normal form(2NF) Remove _____ Dependencies Remove … Third normal form (3NF) Boyce-Codd normal form (BC-NF) Fourth normal Form (4NF) Fifth normal form (5NF) Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems Remove remaining anomalies resulting from multiple candidate keys Remove Multivalued Dependencies Remove Remaining Anomalies TM 4-76 A Process of 1NF to 2NF (EMPLOYEE2 - - 1NF) (b) Functional Dependencies in EMPLOYEE2 Dependency on entire primary key EmpID CourseTitle Name DeptName Salary DateCompleted Dependency on only part of the key EmpID Name DeptName 100 100 140 110 110 190 150 150 Margaret Simpson Marketing Margaret Simpson Marketing Allen Beeton Accounting Chris Lucero Info. System Chris Lucero Info. System Lorenzo Davis Finance Susan Martin Marketing Susan Martin Marketing Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems Salary Course Title Date Completed 48,000 48,000 52,000 43,000 43,000 55,000 42,000 42,000 SPSS Surveys Tax Acc SPSS C++ 6/19/200X 10/7/200X 12/8/200X 1/12/200X 4/22/200X SPSS Java 6/16/200X 8/12/200X TM 4-77 Functional Dependencies in EMPLOYEE2 Dependency on entire primary key EmpID CourseTitle Name DeptName Salary DateCompleted Dependency on only part of the key (partial dep.) EmpID, CourseTitle DateCompleted EmpID Name, DeptName, Salary Therefore, NOT in 2nd Normal Form!! Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-78 EMPLOYEE2 EmpID Summary on Normalization: from 1NF to 2NF CourseTitle Name DeptName Salary DateCompleted Partial Depend. EMPLOYEE1 EmpID Name DeptName Salary EMP_COURSE 2NF 3NF ? EmpID Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems CourseTitle DateCompleted TM 4-79 Summary on Normalization EMPLOYEE2 (1NF) EmpID Name DeptName Salary Course Title 100 100 140 110 110 190 150 150 Margaret Simpson Margaret Simpson Allen Beeton Chris Lucero Chris Lucero Lorenzo Davis Susan Martin Susan Martin Marketing Marketing Accounting Info. System Info. System Finance Marketing Marketing 48,000 48,000 52,000 43,000 43,000 55,000 42,000 42,000 EMPLOYEE1 EmpID 100 140 110 190 150 Name Margaret Simpson Allen Beet Chris Lucero Lorenzo Davis Sususan Martin SPSS 6/19/200X Surveys 10/7/200X Tax Acc 12/8/200X SPSS 1/12/200X C++ 4/22/200X SPSS Java Salary Marketing Accounting Info. System Finance Marketing 48,000 52,000 43,000 55,000 42,000 Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems Is there any anomaly (ies)? 6/16/200X 8/12/200X EMP_COURSE 3NF DeptName Date Completed EmpID Course Title Date Completed 100 100 140 110 110 150 150 SPSS 6/19/200X Surveys 10/7/200X Tax Acc 12/8/200X SPSS 1/12/200X C++ 4/22/200X SPSS 6/19/200X Java 8/12/200X TM 4-80 Third Normal Form • 2NF and no transitive dependencies (functional dependency between non-key attributes.) • Other examples Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-81 Figure: 4-22 Steps in normalization Table with Multivalued attributes Remove Multivalued Attributes First normal form (1NF) Second normal form(2NF) Third normal form (3NF) Boyce-Codd normal form (BC-NF) Fourth normal Form (4NF) Fifth normal form (5NF) Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems Remove Partial Dependencies Remove _________ Dependencies Remove remaining anomalies resulting from multiple candidate keys Remove Multivalued Dependencies Remove Remaining Anomalies TM 4-82 Extra example-1: Relation with transitive dependency (a) SALES relation with simple data SALES Cust_ID Name Salesperson Region 8023 Anderson 101 South 9167 Bancroft 102 West 7924 Hobbs 101 South 6837 Tucker 103 East 8596 Eckersley 102 West 7018 Arnold 104 North Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-83 What Anomalies might be in SALES relation? WHY? Because it is … • Insertion anomaly ? • Deletion anomaly ? • Modification anomaly ? Not in the 3NF (why?) (transitive dependency) SALES Cust_ID Name Salesperson Region 8023 Anderson 101 South 9167 Bancroft 102 West 7924 Hobbs 101 South 6837 Tucker 103 East 8596 Eckersley 102 West 7018 Arnold 104 North Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems N TM 4-84 Extra example-1: Relation with transitive dependency CustID Name CustID Salesperson CustID Region All this is OK (2nd NF) Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-85 Extra example-1: Relation with transitive dependency CustID Name CustID Salesperson CustID Region and Salesperson Region All this is OK (2nd NF) BUT CustID Salesperson Region implies CustID Region Transitive dependency (not in 3rd NF) TM 4-86 Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems Extra example-1: (b) Relations in 3NF Remove a transitive dependency Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-87 Extra example-1: Removing a transitive dependency (a) Decomposing the SALES relation SALES1 S_PERSON Cust_ID Name Salesperson 8023 Anderson 101 101 South 9167 Bancroft 102 102 West 7924 Hobbs 101 103 East 6837 Tucker 103 104 North 8596 Eckersley 102 7018 Arnold 104 Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems Salesperson Region TM 4-88 Extra example-2: Relation transitive dependencies Snum Origin Destination Distance 409 618 723 824 629 Seattle Chicago Boston Denver Minneapolis Denver Dallas Atlanta Los Angeles St. Louis 1,537 1,058 1,214 1,150 587 ?NF Insertion anomaly? Deletion anomaly? Modification anomaly? Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-89 Extra example-2: Relation transitive dependencies SHIPMENT Snum Origin Destination Distance Snum Origin 409 618 723 824 629 Seattle Chicago Boston Denver Minneapolis Destination Denver Dallas Atlanta Los Angeles St. Louis Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems ?NF Origin Destination Distance Seattle Chicago Boston Denver Minneapolis Denver Dallas Atlanta Los Angeles St. Louis 1,537 1,058 1,214 1,150 587 TM 4-90 EMPLOYEE2 EmpID Summary on Normalization: from 1NF to 2NF CourseTitle Name DeptName Salary DateCompleted Partial Depend. EMPLOYEE1 EmpID Name DeptName Salary EMP_COURSE 2NF 3NF ? EmpID Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems CourseTitle DateCompleted TM 4-91 Figure: 4-22 Steps in normalization Table with Multivalued attributes Remove ________ Attributes First normal form (1NF) Second normal form(2NF) Third normal form (3NF) Boyce-Codd normal form (BC-NF) Fourth normal Form (4NF) Fifth normal form (5NF) Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems Remove _______ Dependencies Remove ______ Dependencies Remove remaining anomalies resulting from multiple candidate keys Remove Multivalued Dependencies Remove Remaining Anomalies TM 4-92 Steps of Database Development User view-1 User view-2 User view-3 … … User view-N … User interview & Integrated Model Conceptual Schema (Model) Logical Model (ERD or E/ERD) (Seven) Relations ________ (more relations produced) _____________ (up to 3NF) (more tables created) Implementation (w/Physical Model) Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-93 END of CHAPTER 4 In class exercise (p.193) #3- a,b,c,d HW (using Visio) (p.193-194) #7 - a,b,c,d,e,f Bonus #8 – a,b,c,d In-class Quiz next class Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-94 MVC_Hospital HW Logical Design Phase Draw a entity-relationship diagram (enterprise model) for Mountain View community Hospital, based on the narrative description of the case and this handout (but the entities are from the five (5) figures shown above). You should create a file and turn in with a hardcopy (called MVC_Hospital_DD.docx) contains the following materials: 1. Read and employ materials from chapters 2,3 and 4. 2. Include entities, associations (with detail cardinality), and attributes. 3. Determine and draw the order of entering data Next phase -- implementation, create SQL script file for table structure and data base (values). Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-95 Hint: You need to create VIEW (one or more) to help you create SQL efficiently and effectively See sample on the Bb -- version 1 for charge_view that includes Patient Name CREATE OR REPLACE VIEW charge_view(Patient_No,Patient _Name, Item_Code, Charge) AS SELECT patient.patient_no, patient.p_first|| ' ' ||patient.p_last, pt_charg.item_code, (charge*num_times_admitted) FROM patient, pt_charg, item WHERE item.item_code = pt_charg.item_code AND patient.patient_n o = pt_charg.patient_no ORDER BY patient.p_last; Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-96 MVC_Hospital Create two script files: 1. a script file (MVC_Hospital_Lastname_Firstname.SQL) that contains a set of commands of DROP, CREATE, and INSERT that performs the same functions as in the script file of Northwoods.sql 2. Second script file (MVC_Hospital_QUERIES_Lastname_Firstname.SQL) containing a set of SQL commands that answer the questions. Test the query one/time successfully. Note that you may need other SQL commands and create database views (see pptx file for introducing VIEWS) for the purpose of answering questions easily. You may need to read other references related the SQL from the text book (e.g., Chapter 7 of McFadden). 3. Spool (2) and save it in the file MVC_Hospital_Spool_Lastname_Firstname.txt Finally, you create a new file (*.docx) containing all work done from Part I and save them in the file MVC_Hospital_Complete_Lastname_Firstname.docx. The file should contain your class information and personal information. 4. UPLOAD the .docx file to the Bb by the deadline. Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-97 Normalized vs. De-normalized • We will study the concept and technique of “normalization and de-normalization” as well as OLTP and OLAP. Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-98 More on OLTP vs. OLAP pk fk pk pk: primary key fk: foreign key Fig. Extra-a: A simple database with a relation between two tables. For those have database background. • The figure depicts a relational database environment with two tables. • The first table contains information about pet owners; the second, information about pets. The tables are related by the single column they have in common: Owner_ID. • By relating tables to one another, we can reduce ____________ of data and improve database performance. • The process of breaking tables apart and thereby reducing data redundancy is called _______________. pk OLTP vs. OLAP (cont.) fk pk • Most relational databases which are designed to handle a high number of reads and writes (updates and retrievals of information) are referred to as ________ (OnLine Transaction Processing) systems. • OLTP systems are very efficient for high volume activities such as cashiering, where many items are being recorded via bar code scanners in a very short period of time. • However, using OLTP databases for analysis is generally not very efficient, because in order to retrieve data from multiple tables at the same time, a query containing ________ must be used. OLTP vs. OLAP (cont.) • In order to keep our transactional databases running quickly and smoothly, we may wish to create a data warehouse. A data warehouse is a type of large database (including both current and historical data) that has been _____________ and archived. • Denormalization is the process of intentionally combining some tables into a single table in spite of the fact that this may introduce duplicate data in some columns. Fig. Extra-b: A combination of the tables into a single dataset. • The figure depicts what our simple example data might look like if it were in a data warehouse. When we design databases in this way, we reduce the number of joins necessary to query related data, thereby speeding up the process of analyzing our data. • Databases designed in this manner are called __________ (OnLine Analytical Processing) systems. Merging Relations (View Integration) • In a project development process, there may be a number of separate E-R diagrams and user views created and some of them may be redundant. • Therefore, some relations should be merged to remove the redundancy. Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-102 Merging Relations (View Integration - An example) EMPLOYEE1( EmployeeID, Name, Address, Phone) EMPLOYEE2(EmployeeID, Name, Address, Jobcode, No_Years) EMPLOYEE(EmployeeID, Name, Address, Phone, Jobcode, No_Years) Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-103 Merging Relations (Problems on View Integration) Issues to watch out for when merging entities from different ER models: Synonyms: Different names, same meaning. Homonyms: Same name, different meanings. Transitive Dependencies: dependencies–even if relations are in 3NF prior to merging, they may not be after merging Supertype/Subtype May be hidden prior to merging Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-104 Problems on View Integration • Synonyms: Different names, same meaning. STUDENT1(StudentID, Name) STUDENT2(MatriculationNo,Name, Address) STUDENT(SSN, Name, Address) • Homonyms: Same name, different meanings. STUDENT1(StudentID, Name,Address) STUDENT2(StudentID,Name, Phone_No,Address) STUDENT(StudentID,Name, Phone_No, Campus_Address, Permanent_Address) Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-105 Problems on View Integration • Transitive Dependencies STUDENT1(StudentID, Major) STUDENT2(StudentID, Advisor) the result is ... STUDENT(StudentID, Major, Advisor) ??NF and after removing transitive dependency STUDENT_Major(StudentID, Major) Major_Advisor(Major, Advisor) Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-106 Problems on View Integration • Supertype/Subtype PATIENT1(PatientID, Name, Address) PATIENT2(PatientID, Room_No) PATIENT(PatientID, Name, Address) Two subtypes are hidden prior to merging RESIDENT_PATIENT(PatientID, RoomNo) OUTPATIENT(PatientID, DateTreated) Copyright © Addison Wesley Longman, Inc. & Dr. Chen, Business Database Systems TM 4-107