Database Design Chapter 9 Part-2: Normalization 1 Outline Part Two Normalization − 1NF − 2NF − 3NF Integrity Controls 2 Study the problems with this table. This is why we need Normalization! Update Anomaly Carla called to inquire about her appointment time on 12/2/2015. She also gives us her new phone number: 777-1234 12/2/2015 Delete Anomaly Mike called to cancel his appointment. Insert Anomaly Sue calls to make another appointment for December 15th. Appointment Table Appt No 1 2 3 4 5 6 7 8 9 10 Appt Date 12/1/2015 12/1/2015 12/1/2015 12/2/2015 12/2/2015 12/2/2015 12/2/2015 12/2/2015 12/3/2015 12/3/2015 Appt Planned Appt Time Duration Type 3:00 AM 1.00 Physical 3:00 AM 0.25 Shot 3:15 AM 0.50 Flu 10:00 AM 0.50 Migraine 10:15 AM 0.25 Shot 10:30 AM 0.25 Shot 10:45 AM 0.50 Flu 11:00 AM 1.00 Physical 10:30 AM 1.00 Physical 9:00 AM 0.50 Migraine Patient ID 466927 456789 194756 329657 987453 384788 438754 345875 466927 345875 First Nm Lisa Sue Brandon Marcus Mike Tonya Iliana Carla Lisa Carla Last Nm Garcia Carey Pierre Schwartz Jones Johnson Hnatt Basich Garcia Basich Phone 562-3456 432-1234 432-7877 239-5502 456-0202 432-8806 823-4303 857-5566 562-3456 857-5566 Doctor ID C678 A528 S626 A528 G123 S626 C678 A528 C678 C678 Doctor Nm Chapman Lopez Smith Lopez Gray Smith Chapman Lopez Chapman Chapman 3 Normalization (def) the process of converting complex data structures into simple, stable data structures. Every non-primary key attribute depends upon the whole primary key and nothing but the primary key. Purpose: Create well-structured relations/tables Why? Data Redundancy = Data Quality & Integrity Result: where we can insert, modify, and delete the rows without errors or inconsistencies. 4 Normalization Normalization Normal Form Definition 1NF The table has no repeating fields or groups of fields 2NF No Partial Dependencies! The table is in 1NF, and … every non-key attribute is functionally dependent on the entire primary key 3NF No Transitive Dependencies! The table is in 2NF, and … each non-key attribute is not functionally dependent on another non-key attribute First Normal Form (1NF) The table has: No multi-valued attributes − EMP (ssn, name, dept, sal, dependents) No repeating fields or groups of fields − EMP (ssn, name, dept, sal, dep1, dep2, dep3, … depN) 6 Functional Dependency BA Field A is functionally dependent on Field B if for each value of B, there is only 1 corresponding value of A Alternatively, it can be read as “Field B determines Field A” − If we know the value of B, we can obtain the value of A. Functional Dependency involves a one-to-one relationship between the values of fields • EmpNo Salary • Salary EmpNo • StudentId Name • HomeTown LastName Notes: − Sample Data does not prove the existence of a functional dependency. − Knowledge of problem domain is critical! 7 Second Normal Form (2NF) in 1NF and if each non-key attribute is functionally 1NF dependent on entire primary key What is the PK? ______________ Need a new table since CatIssueDate is determined only by ______________ 2NF 8 Third Normal Form (3NF) in 2NF and if no non-key attribute is functionally 2NF dependent on any other non-key attribute What is the PK? ______________ Need a new table since State is determined by ______________ 3NF 9 Employee Data Courses Taken Listing EmpNo 100 Name John Dept Salary MKTG 42000 Course Date Completed Dept Phone Ext. SPSS 6/19/2013 5325 Surveys 11/3/2014 5325 140 Sue ACCT 41000 Tax Acc 12/1/2014 4422 110 Bob INFO 70000 Java 3/21/2013 7373 C# 10/23/2015 7373 190 Alex FINA 65000 Investmts. 2/1/2013 4477 150 Jennifer INFO 49000 Java 3/11/2012 7373 Oracle 5/19/2013 7373 Notice in this report that Employees can take multiple courses. 10 First Normal Form(1NF) No repeating fields or groups! Note: notice the Primary Key! EmpCourse EmpNo Name Dept Salary Course Date Completed Dept Phone Ext. 100 John MKTG 42000 SPSS 6/19/2013 5325 100 John MKTG 42000 Surveys 11/3/2014 5325 140 Sue ACCT 41000 Tax Acc 12/1/2014 4422 110 Bob INFO 70000 Java 3/21/2013 7373 110 Bob INFO 70000 C# 10/23/2015 7373 190 Alex FINA 65000 Investmts. 2/1/2013 4477 150 Jennifer INFO 49000 Java 3/11/2012 7373 150 Jennifer INFO 49000 Oracle 5/19/2013 7373 EmpCourse(EmpNo, Name, Dept, Salary, Course, Date Completed, DeptPhoneExt) 11 Let’s transform this table from 1NF to 2NF Ensure that all non-key attributes are functionally dependent on the entire primary key! EmpCourse EmpNo Name Dept Salary Course Date Completed Dept Phone Ext. 100 John MKTG 42000 SPSS 6/19/2013 5325 100 John MKTG 42000 Surveys 11/3/2014 5325 140 Sue ACCT 41000 Tax Acc 12/1/2014 4422 110 Bob INFO 70000 Java 3/21/2013 7373 110 Bob INFO 70000 C# 10/23/2015 7373 190 Alex FINA 65000 Investmts. 2/1/2013 4477 150 Jennifer INFO 49000 Java 3/11/2012 7373 150 Jennifer INFO 49000 Oracle 5/19/2013 7373 EmpCourse(EmpNo, Name, Dept, Salary, Course, Date Completed, DeptPhoneExt) EmpNo, Course EmpNo Course 12 2NF NOW, all non-key attributes are functionally dependent on the entire primary key! Employee EmpNo Name EmpCourse Dept Salary Dept Phone Ext. EmpNo Course Date Completed 100 John MKTG 42000 5325 100 SPSS 6/19/2013 140 Sue ACCT 41000 4422 100 Surveys 11/3/2014 110 Bob INFO 70000 7373 140 Tax Acc 12/1/2014 190 Alex FINA 65000 4477 110 Java 3/21/2013 150 Jennifer INFO 49000 7373 110 C# 10/23/2015 190 Investmts. 2/1/2013 150 Java 3/11/2012 150 Oracle 5/19/2013 Employee (EmpNo, Name, Dept, Salary, DeptPhoneExt) EmpCourse( EmpNo, Course, Date Completed 13 Not in 3NF Ensure that all non-primary key attributes do not depend on each other (i.e. no transitive dependencies). Employee EmpNo Name EmpCourse Dept Salary Dept Phone Ext. EmpNo Course Date Completed 100 John MKTG 42000 5325 100 SPSS 6/19/2013 140 Sue ACCT 41000 4422 100 Surveys 11/3/2014 110 Bob INFO 70000 7373 140 Tax Acc 12/1/2014 190 Alex FINA 65000 4477 110 Java 3/21/2013 150 Jennifer INFO 49000 7373 110 C# 10/23/2015 190 Investmts. 2/1/2013 150 Java 3/11/2012 150 Oracle 5/19/2013 Do any of the non-key attributes (Name, Dept, or Salary) determine the phone number? __________ DeptPhoneExt This one is already in 3NF since it has only has one non-key attribute! 14 3NF NOW, all non-primary key attributes do not depend on each other Employee EmpNo Name EmpCourse Dept Salary EmpNo Course Dept Date Completed Dept Dept Phone Ext. 100 John MKTG 42000 100 SPSS 6/19/2013 MKTG 5325 140 Sue ACCT 41000 100 Surveys 11/3/2014 ACCT 4422 110 Bob INFO 70000 140 Tax Acc 12/1/2014 INFO 7373 190 Alex FINA 65000 110 Java 3/21/2013 FINA 4477 150 Jennifer INFO 49000 110 C# 10/23/2015 190 Investmts. 2/1/2013 150 Java 3/11/2012 150 Oracle 5/19/2013 Employee(EmpNo, Name, Dept, Salary) EmpCourse(EmpNo, Course, Date Completed Department(Dept, DeptPhoneExt) 15 Using Normalization we learned… that to represent the data on this report in a relational database, we will need to create 3 separate tables. EmpNo Name Dept Salary 100 John MKTG 42000 140 Sue ACCT 41000 110 Bob INFO 70000 190 Alex FINA 65000 150 EmpNo INFO Course Jennifer 49000 100 SPSS 6/19/2013 100 Surveys 11/3/2014 140 Tax Acc 12/1/2014 110 Java 3/21/2013 110 C# 10/23/2015 190 Dept MKTG Date Completed Investmts. Dept 150 Java Phone Ext. 150 Oracle 5325 ACCT 4422 INFO 7373 FINA 4477 2/1/2013 3/11/2012 5/19/2013 16 Here's our solution! 0..* EmpCourse EmpNo PK Course DateCompleted 1..1 Employee EmpNo - PK Name DeptCode Salary 1..* What are the foreign keys? 1..1 Dept DeptCode - PK DeptPhoneExt 17 Integrity Controls Input Controls prevent erroneous data − Value limit controls − Completeness controls − Data validation controls − Field combination controls Output Controls accurate, current, and complete − Physical access to output devices − Discarded output − Completeness • pg 1 of 10 − Currency • date & timestamp 18 Integrity Controls Access Controls SIUD; Views Complex update controls Handling multi-user updates – DBMS handles Protection from hardware failures Backup & Recovery Redundancy Transaction Logging 2 benefits 19 Integrity Controls Fraud Prevention Fraud Triangle Techniques to risk Separation of duties Records & Audit trails Monitoring − Unusual transactions Asset control − Limit physical access Security 20 Normalization Practice Order No Cust No Name Addr City St Zip Order Date 61384 1273 Cont. Designs 123 Oak Austin TX 28384 11/04/13 61384 1273 Cont. Designs 123 Oak Austin TX 28384 61384 1273 Cont. Designs 123 Oak Austin TX 62890 3891 J Consultants 523 Pine Waco 62890 3891 J Consultants 523 Pine 63129 1273 Cont. Designs 123 Oak Promised Date Prod No Desc Qty Ord Unit Price 11/21/13 M128 Bookcase 4 200 11/04/13 11/21/13 B381 Cabinet 2 150 28384 11/04/13 11/21/13 R210 Table 1 500 TX 76712 11/15/13 11/21/13 A891 Chair 2 300 Waco TX 76712 11/15/13 11/21/13 M128 Bookcase 8 200 Austin TX 28384 12/10/13 12/29/13 A891 Chair 6 300 It’s already in 1NF …but you need to understand why! Convert this to 2NF. How many tables do you now have? Convert this to 3NF. How many tables do you now have? 21 Normalization Practice Appointment Table Appt No 1 2 3 4 5 6 7 8 9 10 Appt Date 12/1/2015 12/1/2015 12/1/2015 12/2/2015 12/2/2015 12/2/2015 12/2/2015 12/2/2015 12/3/2015 12/3/2015 Appt Planned Appt Time Duration Type 3:00 AM 1.00 Physical 3:00 AM 0.25 Shot 3:15 AM 0.50 Flu 10:00 AM 0.50 Migraine 10:15 AM 0.25 Shot 10:30 AM 0.25 Shot 10:45 AM 0.50 Flu 11:00 AM 1.00 Physical 10:30 AM 1.00 Physical 9:00 AM 0.50 Migraine Patient ID 466927 456789 194756 329657 987453 384788 438754 345875 466927 345875 First Nm Lisa Sue Brandon Marcus Mike Tonya Iliana Carla Lisa Carla Last Nm Garcia Carey Pierre Schwartz Jones Johnson Hnatt Basich Garcia Basich Phone 562-3456 432-1234 432-7877 239-5502 456-0202 432-8806 823-4303 857-5566 562-3456 857-5666 Doctor ID C678 A528 S626 A528 G123 S626 C678 A528 C678 C678 Doctor Nm Chapman Lopez Smith Lopez Gray Smith Chapman Lopez Chapman Chapman It’s already in 1NF, and 2NF …but you need to understand why! Convert this to 3NF. How many tables do you now have? 22