Relation Normalization (Chapter 14)
Database Management COP4540, SCS, FIU

Modification Anomalies
• What are modification anomalies?
  – Errors or inconsistencies that may result when a user attempts to update a relation.
• Types of anomalies.
  – Insertion anomalies.
    • An independent piece of information cannot be recorded in a relation unless unrelated information is inserted at the same time.
  – Update anomalies.
    • Updating a piece of information requires changes at multiple locations, beyond what the referential integrity rule requires.
  – Deletion anomalies.
    • The deletion of a piece of information unintentionally removes other information.

Normal Forms
• Normal forms are classes of relations and the techniques for preventing anomalies.
• Normal forms are classified by the type of modification anomalies that have been removed.
• Types of normal forms:
  – First Normal Form (1NF).
  – Second Normal Form (2NF).
  – Third Normal Form (3NF).
  – Boyce-Codd Normal Form (BCNF).
  – Fourth Normal Form (4NF).
  – Fifth Normal Form (5NF).
  – Domain/Key Normal Form (DK/NF).

First Normal Form (1NF)
• A relation R is in 1NF if and only if all attribute domains contain atomic values only.
• Any table that meets the definition of a relation is said to be in first normal form, i.e., a relation in a relational schema is always in 1NF.

Second Normal Form (2NF)
• A relation is in 2NF if and only if it is in 1NF and has no partial dependencies.
• Partial dependency
  – A dependency in which one or more non-key attributes are functionally dependent on part (but not all) of the key.
• Two extreme cases in which a 1NF relation is automatically in 2NF
  – The primary key consists of only one attribute.
  – No non-key attributes exist in the relation.
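The partial-dependency test behind 2NF can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a full 2NF checker: it inspects only the FDs that are listed explicitly (not every FD they imply) and assumes a single known key. The relation ENROLL(SID, CID, Grade, Name) and the function name partial_dependencies are hypothetical, chosen only for illustration.

```python
def partial_dependencies(key, fds):
    """Return the listed FDs whose left side is a proper subset of the key
    and whose right side includes at least one non-key attribute."""
    key = frozenset(key)
    found = []
    for lhs, rhs in fds:
        lhs, rhs = frozenset(lhs), frozenset(rhs)
        non_key_rhs = rhs - key          # attributes on the right that are not in the key
        if lhs < key and non_key_rhs:    # "<" on sets means proper subset
            found.append((sorted(lhs), sorted(non_key_rhs)))
    return found


# Hypothetical relation ENROLL(SID, CID, Grade, Name) with key {SID, CID}.
enroll_key = {"SID", "CID"}
enroll_fds = [
    ({"SID", "CID"}, {"Grade"}),  # depends on the whole key: fine for 2NF
    ({"SID"}, {"Name"}),          # depends on part of the key: partial dependency
]

violations = partial_dependencies(enroll_key, enroll_fds)
print(violations)  # [(['SID'], ['Name'])]
```

With a single-attribute key, no left side can be a proper non-empty subset of the key, which is why the first extreme case never reports a partial dependency.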
Third Normal Form (3NF)
• A relation R is in 3NF if: whenever A1 A2 … An → B is a nontrivial dependency, either {A1, …, An} is a superkey, or B is a member of some key.
• Equivalently, a relation is in 3NF if it is in 2NF and has no transitive dependencies.
• Transitive dependency
  – A functional dependency between two (or more) non-key attributes.

Example
SALES(Cust_ID, Name, Salesperson, Region)
FDs: Cust_ID → Name, Salesperson, Region
     Salesperson → Region

Cust_ID  Name       Salesperson  Region
8023     Anderson   Smith        South
9167     Bancroft   Hicks        West
7924     Hobbs      Smith        South
6837     Tucker     Hernandez    East
8596     Eckersley  Hicks        West
7018     Arnold     Faulb        North

Example (decomposed into 3NF)
SALES1(Cust_ID, Name, Salesperson)
SPERSON(Salesperson, Region)

Cust_ID  Name       Salesperson
8023     Anderson   Smith
9167     Bancroft   Hicks
7924     Hobbs      Smith
6837     Tucker     Hernandez
8596     Eckersley  Hicks
7018     Arnold     Faulb

Salesperson  Region
Smith        South
Hicks        West
Hernandez    East
Faulb        North

Another Example
SHIPMENT(Snum, Origin, Destination, Distance)
FDs: Snum → Origin, Destination, Distance
     Origin, Destination → Distance
Decomposed into 3NF:
SHIPMENT(Snum, Origin, Destination)
DISTANCE(Origin, Destination, Distance)

Relation Normalization Question

SID  Name    CID    Grade  Text  Major  Dept
S1   Joseph  CIS01  A      B1    CS     CIS
S1   Joseph  CIS02  B      B2    CS     CIS
S2   Alice   CIS01  B      B1    IS     CIS
S3   Tom     EEE01  A      B3    EE     EEE

FDs: SID, CID → Grade
     SID → Name, Major
     CID → Text
     Major → Dept

1. What can be the primary key for the above relation?
2. Decompose the above relation into 2NF and then 3NF relations.

Boyce-Codd Normal Form (BCNF)
• A relation R is in BCNF if and only if: whenever a nontrivial dependency A1 A2 … An → B1 B2 … Bm holds for R, it must be the case that {A1, A2, …, An} is a superkey for R.
• BCNF is one of the most important normal forms.
  – Relations in BCNF have no anomalies with regard to functional dependencies.
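The superkey condition in the BCNF definition can be checked mechanically with attribute closure: a set of attributes is a superkey exactly when its closure covers the whole relation. The sketch below applies this to the SALES example, reading its FDs as Cust_ID → Name, Salesperson, Region and Salesperson → Region. The function names closure and bcnf_violations are illustrative assumptions, and the check looks only at the listed FDs, which is sufficient for the original relation but not for arbitrary sub-relations.

```python
def closure(attrs, fds):
    """Compute the closure of attrs under the functional dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already derivable, add the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Return the listed nontrivial FDs whose left side is not a superkey."""
    bad = []
    for lhs, rhs in fds:
        nontrivial = not set(rhs) <= set(lhs)
        is_superkey = closure(lhs, fds) >= set(relation)
        if nontrivial and not is_superkey:
            bad.append((sorted(lhs), sorted(rhs)))
    return bad


sales = {"Cust_ID", "Name", "Salesperson", "Region"}
sales_fds = [
    ({"Cust_ID"}, {"Name", "Salesperson", "Region"}),
    ({"Salesperson"}, {"Region"}),
]

print(bcnf_violations(sales, sales_fds))  # [(['Salesperson'], ['Region'])]
```

The reported FD, Salesperson → Region, is the same transitive dependency that the decomposition into SALES1 and SPERSON removes.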
1. Each student may major in several subjects.
2. For each major, a given student has only one advisor.
3. Each major has several advisors.
4. Each advisor advises only one major.
5. Each advisor advises several students in one major.

FDs: StuId, Major → Fname
     Fname → Major

StuId  Major    Fname
S1     Physics  Einstein
S1     Music    Mozart
S2     Biology  Darwin
S3     Physics  Bohr
S4     Physics  Einstein

Decomposed into BCNF:

StuId  Fname
S1     Einstein
S1     Mozart
S2     Darwin
S3     Bohr
S4     Einstein

Fname     Major
Einstein  Physics
Mozart    Music
Darwin    Biology
Bohr      Physics

Decomposition into BCNF
1. Set D = {R};
2. While there is a relation Q in D that is not in BCNF do
   begin
     choose a relation Q in D that is not in BCNF;
     find an FD X → Y in Q that violates BCNF;
     expand the right side to include X+ (the closure of X);
     replace Q in D by two relations, (Q − X+) ∪ X and X+;
   end;

Tables
  ↓ Remove non-atomic attributes
1NF
  ↓ Remove partial dependencies
2NF
  ↓ Remove transitive dependencies
3NF
  ↓ Remove remaining anomalies resulting from functional dependencies
BCNF
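The BCNF decomposition loop can be sketched in Python as follows. This is an illustrative sketch, not a definitive implementation: the function names are assumptions, violations are found by brute force over attribute subsets (exponential, so suitable only for small relations), and X+ is intersected with Q so that each new relation stays inside the one being split.

```python
from itertools import combinations


def closure(attrs, fds):
    """Compute the closure of attrs under the functional dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def find_violation(q, fds):
    """Return (X, X+ restricted to Q) for some X that violates BCNF in Q, else None."""
    attrs = sorted(q)
    for size in range(1, len(attrs)):
        for combo in combinations(attrs, size):
            x = set(combo)
            x_plus = closure(x, fds) & q
            # X determines something new but is not a superkey of Q.
            if x_plus != x and x_plus != q:
                return x, x_plus
    return None


def bcnf_decompose(relation, fds):
    """Split relation until every piece is in BCNF with respect to fds."""
    pending = [frozenset(relation)]   # the set D from the procedure
    done = []
    while pending:
        q = pending.pop()
        violation = find_violation(q, fds)
        if violation is None:
            done.append(q)
        else:
            x, x_plus = violation
            pending.append(frozenset((q - x_plus) | x))  # (Q - X+) union X
            pending.append(frozenset(x_plus))            # X+
    return done


advising = {"StuId", "Major", "Fname"}
advising_fds = [
    ({"StuId", "Major"}, {"Fname"}),
    ({"Fname"}, {"Major"}),
]

result = bcnf_decompose(advising, advising_fds)
print(sorted(sorted(r) for r in result))
# [['Fname', 'Major'], ['Fname', 'StuId']]
```

The output matches the two tables in the student, major, and advisor example: (StuId, Fname) and (Fname, Major). Each split produces strictly smaller relations, so the loop terminates.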