INF 216D.EBS 416D DATABASE MANAGEMENT II
UNIT 1: NORMALIZATION

SESSIONS
1 First Normal Form (1NF)
2 Second Normal Form (2NF)
3 Third Normal Form (3NF)
4 Boyce-Codd Normal Form (BCNF)
5 Fourth Normal Form (4NF)
6 Fifth Normal Form (5NF)

INTRODUCTION
Normalization entails building tables and connecting those tables in accordance with rules intended to safeguard the data and increase the database's flexibility by removing duplication and inconsistent dependencies. It is also a database design technique that reduces data redundancy and eliminates undesirable characteristics such as insertion, update, and deletion anomalies.

NORMALIZATION 3 2023

THE FIRST NORMAL FORM (1NF)

1NF
o The First Normal Form (1NF) is the first step in the normalization process of a database.
o It is a property of a database table and helps to eliminate data redundancy and improve data integrity.

1NF RULES
o For a relation to be in the First Normal Form, it must follow the rules below.
o Every column must hold single (atomic) values: each cell contains exactly one value, never a list of values.
o Values stored in a column must belong to the same domain, that is, be of the same type or kind. For example, a column designed to store names must store only names, not dates of birth or ages.
o All columns in the table must have unique names. There must never be an instance where two columns in a relation share the same name.

1NF EXAMPLE
o In the example table, two of the students have opted for more than one subject.
o These subjects are stored together in a single column, which violates the 1NF rule that a column must hold a single value.
o The student_name column has a similar problem.
o To rectify these issues, the student_name column must be split into two columns, one for the first name and one for the last name, and the subject column must be redesigned to hold only a single subject per row.

1NF EXAMPLE (CON…)

THE SECOND NORMAL FORM (2NF)

THE SECOND NORMAL FORM (2NF) IS A RELATIONAL DATABASE PROPERTY USED TO ENSURE DATA INTEGRITY AND CONSISTENCY

2NF RULES
o For a relation to qualify for the Second Normal Form, it must satisfy the following constraints.
o It must be in the First Normal Form (1NF): the table has a primary key and the values in each column are atomic (i.e., indivisible).
o It must not contain any partial dependency.
o It should not contain redundant data, which means that the same data should not be duplicated in multiple places.

PARTIAL DEPENDENCY
o Partial dependency is a concept in database normalization that refers to the dependency of a non-key attribute on only part of a candidate key.
o In other words, partial dependency occurs when an attribute depends on only part of a composite primary key and not on the entire key.
o For example, consider a table with the columns "student_id", "student_name", "course_id", and "grade".
o The candidate key is composed of both "student_id" and "course_id".
o However, if the "student_name" attribute depends only on "student_id" and not on "course_id", then we have a case of partial dependency.

2NF EXAMPLE
o The studentName can be determined by the studentId alone, which is only part of the key (studentId, CourseId). This makes the relation partially dependent.
o The Grade, in contrast, is normally determined by studentId and CourseId together, so it depends on the whole key.
o Ideally, in a 2NF design every relation is identified by a single-column key, since a partial dependency can only arise when the key is composite.
o To achieve 2NF for the example table, it may be necessary to split it into multiple smaller tables, each of which represents one specific aspect of the data.
o This decomposition is the core operation of normalization.
o There are several ways to deal with a situation like this, but the most straightforward is to remove CourseId from the table, leaving studentId as the only key.

2NF EXAMPLE (CON…)
In the resulting table, none of the columns is partially dependent. Notice that studentName and Grade are fully dependent on studentId. Alternatively, both the CourseId and Grade columns could be moved out, together with a reference to studentId, to form a new relation.

THE THIRD NORMAL FORM (3NF)

THE THIRD NORMAL FORM (3NF) WAS DEFINED BY E. F. CODD IN 1971, BUILDING ON THE RELATIONAL MODEL HE INTRODUCED IN HIS 1970 PAPER "A RELATIONAL MODEL OF DATA FOR LARGE SHARED DATA BANKS"

3NF
o The Third Normal Form (3NF) is the next level of normalization in a relational database, building upon the First Normal Form (1NF) and the Second Normal Form (2NF).
o It aims to eliminate transitive dependencies in a relation.
o A table is in the Third Normal Form when it is in the Second Normal Form and has no transitive dependencies.

TRANSITIVE DEPENDENCY
o A transitive dependency is an indirect functional dependency: if A determines B and B determines C, then A determines C only through B.
o In a table, it arises when a non-prime attribute depends on another non-prime attribute rather than directly on the primary key.
o This means that non-key attributes should not be dependent on other non-key attributes.
o The goal of 3NF is to eliminate redundancy in a database and improve data integrity by ensuring that each non-key attribute depends only on the primary key, not on other non-key attributes.

3NF EXAMPLE
o exam_name is just another column in the Score table.
o It is neither the primary key nor part of the primary key, yet the total_marks column depends on it.
o This situation is called a transitive dependency.
o To remove a transitive dependency from a relation, we split the table into two, with the appropriate columns in each table. The new table is given a primary key, which is then referenced as a foreign key in the original table.
o The other way to remove the transitive dependency is to split the table into two relations, each with the appropriate columns, and then create a bridge table to connect them.
o The bridge table links the two relations that result from removing the transitive dependency.

3NF EXAMPLE

IMPORTANCE OF REMOVING THE TRANSITIVE DEPENDENCY
o It reduces the amount of duplicate data.
o It enforces data integrity, i.e., data accuracy and consistency are achieved.
o By achieving 3NF, we eliminate transitive dependencies and further reduce the chances of data anomalies, making the data even more consistent and reliable.

BOYCE-CODD NORMAL FORM (BCNF)

THE BOYCE-CODD NORMAL FORM (BCNF) WAS INTRODUCED BY RAYMOND F. BOYCE AND EDGAR F. CODD IN 1974

BCNF
o Boyce and Codd expanded on the earlier work of Codd, who had introduced the concept of relational databases and the principles of database normalization in a 1970 paper.
o The Third Normal Form (3NF), which Codd developed in his earlier work, had significant drawbacks, which Boyce and Codd's work addressed.
o They acknowledged that 3NF was insufficient for some databases, especially relations with overlapping candidate keys, and suggested BCNF as a higher level of normalization.
o Boyce-Codd Normal Form is an extension of the Third Normal Form and is also called the 3.5 Normal Form.
o A decomposition into BCNF may not always preserve every functional dependency.
o In such a scenario, opt for BCNF only if preserving the lost functional dependencies (FDs) is not necessary; otherwise, normalize only up to the Third Normal Form (3NF).

BCNF RULES
o For a table to satisfy the Boyce-Codd Normal Form, it should satisfy the following two conditions:
o It should be in the Third Normal Form.
o For every non-trivial dependency X → Y, X should be a super key.
o The latter point sounds a bit tricky, but in simple words it means that, for a dependency X → Y, X cannot be a non-prime attribute if Y is a prime attribute.
o Achieving BCNF requires careful analysis of the functional dependencies in a database. It may be necessary to break a table down into multiple smaller tables and establish relationships between them through foreign keys.
o This process helps to eliminate the remaining redundancy and makes the data even more consistent and reliable.

BCNF EXAMPLE
o In the example table, a student can enroll in more than one course.
o For example, student_id 10 has registered for Python and Database Management.
o For each course, a lecturer is assigned to handle it. Moreover, there can be multiple lecturers teaching the same course.
o An example is the Python course.
o What would be the primary key of this table?
o student_id + subject together form the primary key, because together they determine every other column in the table.

BCNF EXAMPLE (CON…)
o On careful evaluation, this table satisfies all the normal forms from the first to the third, but not the Boyce-Codd Normal Form.
o We previously noticed that the primary key is student_id + subject together, which means that subject is a prime attribute.
o However, there is another dependency, between the lecturer and subject columns: the subject depends on the lecturer.
o Since subject is a prime attribute and lecturer is a non-prime attribute, the dependency of subject on lecturer is not permitted by the BCNF constraint.

BCNF EXAMPLE (CON…)
Now let's convert the above table into a BCNF table structure. We will have to split the College Enrollment table into two tables, the student table and the lecturer table. Each of the tables will have a primary key, and the primary key of the lecturer table will be referenced as a foreign key in the student table.

BCNF EXAMPLE (CON…)

FOURTH NORMAL FORM (4NF)

John Akwasi Appiah
College of Distance Education
University of Cape Coast

THANK YOU
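The BCNF example from the College Enrollment slides can be checked with a short Python sketch. It is an illustration under stated assumptions, not the table from the slides: the sample rows, the lecturer names, and the helper functions `holds`, `is_superkey`, and `project` are all hypothetical. Note also that checking a dependency against sample rows only shows that the rows are consistent with it; whether the dependency holds in general is a design decision.

```python
# Hypothetical sample rows for the College Enrollment table.
# The lecturer names are invented for illustration.
ENROLLMENT = [
    {"student_id": 10, "subject": "Python", "lecturer": "Lecturer A"},
    {"student_id": 10, "subject": "Database Management", "lecturer": "Lecturer B"},
    {"student_id": 11, "subject": "Python", "lecturer": "Lecturer C"},
    {"student_id": 12, "subject": "Python", "lecturer": "Lecturer A"},
]


def holds(rows, lhs, rhs):
    """Return True if these rows are consistent with the dependency lhs -> rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first value seen for a key is stored; any later mismatch breaks the FD.
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_superkey(rows, attrs):
    """Return True if no two rows share the same values for attrs."""
    keys = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(keys)) == len(keys)


def project(rows, attrs):
    """Project rows onto attrs and drop duplicates, keeping first-seen order."""
    result = []
    for row in rows:
        item = {a: row[a] for a in attrs}
        if item not in result:
            result.append(item)
    return result


# student_id + subject identifies every row, so it can act as the primary key.
key_ok = is_superkey(ENROLLMENT, ["student_id", "subject"])

# lecturer -> subject holds, but lecturer alone is not a super key.
# A dependency whose left-hand side is not a super key violates BCNF.
fd_holds = holds(ENROLLMENT, ["lecturer"], ["subject"])
violates_bcnf = fd_holds and not is_superkey(ENROLLMENT, ["lecturer"])

# Decompose on the offending dependency: the lecturer table is keyed by
# lecturer, and the student table references it as a foreign key.
lecturer_table = project(ENROLLMENT, ["lecturer", "subject"])
student_table = project(ENROLLMENT, ["student_id", "lecturer"])

if __name__ == "__main__":
    print("key (student_id, subject) is unique:", key_ok)
    print("violates BCNF:", violates_bcnf)
    print("lecturer table:", lecturer_table)
    print("student table:", student_table)
```

In the decomposed tables, lecturer is a super key of the lecturer table, so the dependency of subject on lecturer no longer violates BCNF. The subject a student takes is recovered by joining the student table to the lecturer table on lecturer.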