Uploaded by John Appiah

Unit 1 Database Management II

INF 216D.EBS 416D
DATABASE
MANAGEMENT II
UNIT 1: NORMALIZATION
SESSIONS
1. First Normal Form (1NF)
2. Second Normal Form (2NF)
3. Third Normal Form (3NF)
4. Boyce-Codd Normal Form (BCNF)
5. Fourth Normal Form (4NF)
6. Fifth Normal Form (5NF)
INTRODUCTION
Normalization entails building tables and connecting those tables in accordance with rules intended to safeguard the data and increase the database's flexibility by removing duplication and inconsistent dependencies.
Additionally, normalization is a method of database design that lessens data redundancy and gets rid of undesired traits such as insertion, update, and deletion anomalies.
NORMALIZATION
3
2023
THE FIRST NORMAL FORM (1NF)
1NF
o The First Normal Form (1NF) is the first step in the normalization process of a database.
o It is a property of a database table and helps to eliminate data redundancy and improve data integrity.
1NF RULES
o For a relation to be in the First Normal Form, it must follow the rules below.
o Every attribute (column) must be single-valued (atomic).
    o Every column in the table holds a single value per row; it cannot hold multiple values.
o Values stored in a column must be of the same domain.
    o The values stored in a column must be of the same type or kind. For example, a column designed to store names must only store names and not anything else, such as a date of birth or an age.
o All the table columns should have unique names.
    o There should not be an instance where two columns in a relation have the same name.
1NF EXAMPLE
o In the example table, two of the students have opted for more than one subject.
o These subjects have been stored in a single column, which is against the 1NF rule that a column must contain a single value.
o The student_name column has the same problem.
o To rectify these issues, the student_name column must be split into two columns, one for the first name and one for the last name, and the subject column must be designed to contain only a single subject.
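The fix described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the rows, the names in them, and the helper `to_1nf` are hypothetical and are not taken from the slide's table.

```python
# Hypothetical un-normalized rows: student_name and subject each pack
# more than one value into a single column.
unnormalized = [
    {"student_id": 1, "student_name": "Ama Mensah",
     "subject": "Python, Database Management"},
    {"student_id": 2, "student_name": "Kofi Boateng", "subject": "Python"},
]


def to_1nf(rows):
    """Return rows in which every column holds exactly one atomic value."""
    result = []
    for row in rows:
        # Split the name into first and last name.
        first_name, last_name = row["student_name"].split(" ", 1)
        # Emit one row per subject instead of a comma-separated list.
        for subject in row["subject"].split(","):
            result.append({
                "student_id": row["student_id"],
                "first_name": first_name,
                "last_name": last_name,
                "subject": subject.strip(),
            })
    return result


rows_1nf = to_1nf(unnormalized)
for r in rows_1nf:
    print(r)
```

Each student now appears once per subject, so a query such as "who takes Python?" becomes a plain equality test on the subject column.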
1NF EXAMPLE (CON…)
THE SECOND NORMAL FORM (2NF)
The Second Normal Form (2NF) is a relational database property used to ensure data integrity and consistency.
2NF RULES
o For a relation to qualify for the Second Normal Form, it must satisfy the following constraints.
o First, it must be in the First Normal Form (1NF): the table has a primary key and the values in each column of the table are atomic (i.e., indivisible).
o Second, there should not be any partial dependency.
o Removing partial dependencies also removes the redundancy they cause, so the same data is not duplicated in multiple places.
PARTIAL DEPENDENCY
o Partial dependency is a concept in database normalization that refers to the dependency of a non-key attribute on only part of a candidate key.
o In other words, partial dependency occurs when an attribute depends on only part of the primary key and not the entire primary key.
o For example, consider a table with columns: "student_id", "student_name", "course_id", and "grade".
o The candidate key is composed of both "student_id" and "course_id".
o However, if the "student_name" attribute depends only on the "student_id" and not on the "course_id", then we have a case of partial dependency.
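A partial dependency of this kind can be spotted mechanically on sample data. The sketch below is a minimal illustration, assuming the four columns named above; the rows and the helper `determines` are hypothetical. A check on sample rows can only show that a dependency holds in that sample, not that it holds as a rule of the business.

```python
def determines(rows, lhs, rhs):
    """True if the columns in lhs functionally determine column rhs in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        if key in seen and seen[key] != row[rhs]:
            return False  # same lhs value, different rhs value
        seen[key] = row[rhs]
    return True


# Hypothetical sample rows using the column names from the example.
rows = [
    {"student_id": 1, "student_name": "Ama", "course_id": "C1", "grade": "A"},
    {"student_id": 1, "student_name": "Ama", "course_id": "C2", "grade": "B"},
    {"student_id": 2, "student_name": "Kofi", "course_id": "C1", "grade": "C"},
]

candidate_key = ("student_id", "course_id")
non_key = ["student_name", "grade"]

# A partial dependency: a non-key attribute determined by one part of the key.
partial = [
    (part, attr)
    for attr in non_key
    for part in candidate_key
    if determines(rows, [part], attr)
]
print(partial)
```

In this sample only student_name is determined by a single part of the key (student_id); grade needs both student_id and course_id.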
2NF EXAMPLE
o The studentName can be identified by the studentId alone.
o Similarly, the CourseId can be determined by the Grade; both cases make the relation partially dependent.
o In the simplest 2NF design, every relation is identified by a single-attribute key, which makes a partial dependency impossible.
o To achieve 2NF for the example table, it may be necessary to split the relational table into multiple smaller tables, each of which represents a specific aspect of the data.
o This splitting is part of the normalization process.
o There are several ways to deal with situations like this, but the most straightforward is to remove the CourseId from the table.
2NF EXAMPLE (CON…)
In the resulting table, none of the columns is partially dependent.
Notice that the studentName and Grade are fully dependent on the studentId. Alternatively, both the CourseId and the Grade columns could be moved out to form a new relation.
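The alternative split mentioned above, where course and grade go into a relation of their own, can be sketched as follows. This is an assumption-laden illustration: the rows are hypothetical, the column names are borrowed from the partial dependency example, and grade is assumed to depend on the full key (student_id, course_id).

```python
# Hypothetical table with a composite key (student_id, course_id).
rows = [
    {"student_id": 1, "student_name": "Ama", "course_id": "C1", "grade": "A"},
    {"student_id": 1, "student_name": "Ama", "course_id": "C2", "grade": "B"},
    {"student_id": 2, "student_name": "Kofi", "course_id": "C1", "grade": "C"},
]

# student_name depends only on student_id, so it moves to its own table,
# keyed by student_id alone.
students = {row["student_id"]: row["student_name"] for row in rows}

# grade is assumed to depend on the whole key, so it stays with that key.
grades = {(row["student_id"], row["course_id"]): row["grade"] for row in rows}

# Joining the two tables on student_id reproduces the original rows,
# so no information was lost by the split.
rejoined = [
    {"student_id": s, "student_name": students[s], "course_id": c, "grade": g}
    for (s, c), g in grades.items()
]
print(students)
print(rejoined == rows)
```

Each student name is now stored once, so renaming a student is a single update instead of one update per enrolled course.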
THE THIRD NORMAL FORM (3NF)
The Third Normal Form (3NF) was defined by E. F. Codd in 1971 in "Further Normalization of the Data Base Relational Model", building on his 1970 paper "A Relational Model of Data for Large Shared Data Banks".
3NF
o The Third Normal Form (3NF) is the next level of normalization in a relational database, building upon the First Normal Form (1NF) and Second Normal Form (2NF).
o The Third Normal Form (3NF) is a level of database normalization that aims to eliminate transitive dependencies in a relation.
o A table is in the Third Normal Form when it is in the Second Normal Form and has no transitive dependencies.
TRANSITIVE DEPENDENCY
o A transitive dependency is a functional dependency that holds indirectly: if A → B and B → C, then A → C holds through B.
o In a table, it occurs when a non-prime attribute depends on another non-prime attribute rather than directly on the primary key.
o This means that non-key attributes should not be dependent on other non-key attributes.
o The goal of 3NF is to eliminate redundancy in a database and improve data integrity by ensuring that each non-key attribute depends only on the primary key, not on other non-key attributes.
3NF EXAMPLE
o exam_name is just another column in the Score table.
o It is neither the primary key nor part of the primary key, yet the column total_marks depends on it.
o This situation is called a transitive dependency.
o To remove a transitive dependency from a relation, we split the table into two, with the appropriate columns in each. The new table gets its own primary key, which is referenced as a foreign key in the original table.
o Another way to remove the transitive dependency is to split the table into two relations, each with the appropriate columns.
o A bridge table is then created to link the two relations.
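The first approach (a new table with its own primary key, referenced as a foreign key) might look like the following sketch, which uses Python's built-in sqlite3 module. Only exam_name, total_marks, and the Score table come from the example; exam_id, student_id, subject_id, marks, and all the values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
-- exam_name and total_marks move to their own table.
CREATE TABLE exam (
    exam_id     INTEGER PRIMARY KEY,
    exam_name   TEXT NOT NULL UNIQUE,
    total_marks INTEGER NOT NULL
);
-- The score table keeps only a reference to the exam.
CREATE TABLE score (
    student_id INTEGER NOT NULL,
    subject_id INTEGER NOT NULL,
    marks      INTEGER NOT NULL,
    exam_id    INTEGER NOT NULL REFERENCES exam(exam_id),
    PRIMARY KEY (student_id, subject_id)
);
""")

conn.executemany("INSERT INTO exam VALUES (?, ?, ?)",
                 [(1, "Midterm", 50), (2, "Final", 100)])
conn.executemany("INSERT INTO score VALUES (?, ?, ?, ?)",
                 [(10, 1, 42, 1), (10, 2, 77, 2), (11, 1, 38, 1)])

# A join recovers the original wide view whenever it is needed.
result = conn.execute("""
    SELECT s.student_id, e.exam_name, s.marks, e.total_marks
    FROM score AS s
    JOIN exam AS e ON e.exam_id = s.exam_id
    ORDER BY s.student_id, s.subject_id
""").fetchall()
for row in result:
    print(row)
```

Because total_marks is stored once per exam, changing the total for an exam is a single-row update in the exam table.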
IMPORTANCE OF REMOVING THE
TRANSITIVE DEPENDENCY
o It reduces the amount of duplicate data.
o It enforces data integrity, i.e., data accuracy and consistency are achieved.
o By achieving 3NF, we can eliminate transitive dependencies and further reduce the chances of data anomalies, making the data even more consistent and reliable.
BOYCE-CODD NORMAL FORM (BCNF)
The Boyce-Codd Normal Form (BCNF) was developed in 1974 by Raymond F. Boyce and Edgar F. Codd.
BCNF
o Boyce and Codd expanded on the earlier work of Codd, who had introduced the concept of relational databases and the principles of database normalization in a 1970 paper.
o The Third Normal Form (3NF), which Codd developed in his earlier work, had significant drawbacks, which Boyce and Codd's work addressed.
o They acknowledged that 3NF was insufficient for some relations, notably those with overlapping candidate keys, and suggested using BCNF as a higher level of normalization.
o Boyce-Codd Normal Form is an extension of the Third Normal Form and is sometimes called the 3.5 Normal Form.
o A BCNF decomposition does not always preserve every functional dependency.
o In such a scenario, only opt for BCNF if preserving the lost functional dependencies (FDs) is not necessary; otherwise, normalize only up to the Third Normal Form (3NF).
BCNF RULES
o For a table to satisfy the Boyce-Codd Normal Form, it should satisfy the following two conditions:
    o It should be in the Third Normal Form.
    o For any non-trivial dependency X → Y, X should be a super key.
o The latter point sounds a bit tricky, but in simple words it means that, for a dependency X → Y, X cannot be a non-prime attribute if Y is a prime attribute.
o Achieving BCNF requires careful analysis of the functional dependencies in a database. It may be necessary to break down a table into multiple smaller tables and establish relationships between them through the use of foreign keys.
o This process removes dependencies whose determinant is not a super key and makes the data even more consistent and reliable.
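The super key condition can be tested with the standard attribute-closure computation. The sketch below is a minimal illustration; the attribute names and the two functional dependencies are assumptions chosen for the example, and `closure` and `bcnf_violations` are hypothetical helper names.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozensets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have all of lhs, we also get all of rhs.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Return the non-trivial FDs X -> Y whose X is not a super key."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != set(all_attrs)
    ]


# Assumed relation and dependencies, for illustration only.
attrs = {"student_id", "subject", "lecturer"}
fds = [
    (frozenset({"student_id", "subject"}), frozenset({"lecturer"})),
    (frozenset({"lecturer"}), frozenset({"subject"})),
]

violations = bcnf_violations(attrs, fds)
for lhs, rhs in violations:
    print(sorted(lhs), "->", sorted(rhs))
```

Here {student_id, subject} reaches every attribute and is a super key, while {lecturer} reaches only lecturer and subject, so the dependency lecturer → subject is reported as a violation.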
BCNF EXAMPLE
o In the example table, a student can enroll in more than one course.
o For example, student_id 10 has registered for Python and Database Management.
o Each course has a lecturer assigned to handle it, and the same course can be taught by more than one lecturer.
o An example is the Python course.
o What would the primary key of this table be?
o student_id + subject together form the primary key (PK), because the two together identify every other column of a record in the table.
BCNF EXAMPLE (CON…)
o On careful evaluation, this table satisfies all the normal forms from the first to the third, but not the Boyce-Codd Normal Form.
o We noted earlier that the primary key is student_id + subject together, which means the subject column is a prime attribute.
o But there is another dependency, between the lecturer and subject columns: the lecturer determines the subject.
o Since subject is a prime attribute and lecturer is a non-prime attribute, this dependency is not permitted by the constraint placed on BCNF.
BCNF EXAMPLE (CON…)
Now let's convert the above table into a BCNF table structure.
We will have to split the College Enrollment table into two tables, the student table and the lecturer table.
Each of the tables will have a primary key, and the primary key of the lecturer table will be referenced as a foreign key in the student table.
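The split described above can be sketched in plain Python. Student 10 and the two subject names come from the example; the lecturer names, student 11, and the surrogate key lecturer_id are assumptions added for illustration.

```python
# Hypothetical rows of the College Enrollment table before the split.
enrollment = [
    {"student_id": 10, "subject": "Python", "lecturer": "Lecturer A"},
    {"student_id": 10, "subject": "Database Management", "lecturer": "Lecturer B"},
    {"student_id": 11, "subject": "Python", "lecturer": "Lecturer C"},
]

# Lecturer table: one row per lecturer, keyed by an assumed lecturer_id.
# Each lecturer is taken to teach exactly one subject.
lecturer_table = {}
for row in enrollment:
    if row["lecturer"] not in lecturer_table:
        lecturer_table[row["lecturer"]] = {
            "lecturer_id": len(lecturer_table) + 1,
            "lecturer": row["lecturer"],
            "subject": row["subject"],
        }

# Student table: references the lecturer table through lecturer_id.
student_table = [
    {"student_id": row["student_id"],
     "lecturer_id": lecturer_table[row["lecturer"]]["lecturer_id"]}
    for row in enrollment
]

print(list(lecturer_table.values()))
print(student_table)
```

In the lecturer table the only determinant is its key, and the student table carries no non-key dependency, so neither table violates the BCNF condition under these assumptions.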
BCNF EXAMPLE (CON…)
FOURTH NORMAL FORM (4NF)
John Akwasi Appiah
College of Distance Education
University of Cape Coast
THANK YOU