Uploaded by Mohamed Sadiq

Normalization

Introduction
▪ A well-structured relation contains minimal
redundancy and allows users to insert, modify and
delete rows without causing errors or
inconsistencies.
▪ Redundancy in a table not only wastes space, but
can also lead to loss of data integrity in the
database.
Table with Repeating Groups
Anomalies in Databases
▪ An anomaly is an error or inconsistency that may result
when a user attempts to update a relation that contains
redundant data.
▪ There are three types of anomalies:
▪ Insertion anomaly
▪ Update anomaly
▪ Deletion anomaly
Insertion Anomaly
▪ Inability to add data to the database because related
data is absent
▪ E.g.: not being able to enter the registration details of a
student because the course in question does not yet
exist.
▪ With a composite key of (StudNo, CourseID), the row
cannot be added if the CourseID does not exist, because
no part of a primary key may be left empty
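The insertion anomaly above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not the schema from the slides: the Registration table, the StudName column and the sample values are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical single-table design: student and course facts share one relation.
# NOT NULL is stated explicitly on both parts of the composite key.
conn.execute("""
    CREATE TABLE Registration (
        StudNo   INTEGER NOT NULL,
        StudName TEXT,
        CourseID TEXT NOT NULL,
        PRIMARY KEY (StudNo, CourseID)
    )
""")

def try_insert(stud_no, stud_name, course_id):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO Registration VALUES (?, ?, ?)",
                     (stud_no, stud_name, course_id))
        return True
    except sqlite3.IntegrityError:
        return False

# A student with a course can be stored.
accepted_with_course = try_insert(1, "Asha", "C101")

# A student whose course does not exist yet has no CourseID, so the row is rejected.
accepted_without_course = try_insert(2, "Ravi", None)
```

The second student cannot be recorded at all until some course exists, which is exactly the anomaly.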
Update Anomaly
▪ Happens when a change to redundant data leaves
the data inconsistent
▪ E.g.: if a student has multiple registrations, and each
registration also includes the address, every copy of the
address must be updated when it changes. Missing one
copy leaves conflicting addresses for the same student.
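A small sketch of the update anomaly, using plain Python dictionaries in place of table rows. The row layout and the address values are hypothetical and only serve the illustration.

```python
# Hypothetical rows: the student's address is repeated in every registration.
registrations = [
    {"StudNo": 1, "CourseID": "C101", "Address": "12 Lake Rd"},
    {"StudNo": 1, "CourseID": "C102", "Address": "12 Lake Rd"},
]

# The student moves, but only one of the redundant copies is updated.
registrations[0]["Address"] = "7 Hill St"

def addresses_for(stud_no, rows):
    """Collect the distinct addresses stored for one student."""
    return {row["Address"] for row in rows if row["StudNo"] == stud_no}

conflicting = addresses_for(1, registrations)

# More than one address for the same student means the data is inconsistent.
is_inconsistent = len(conflicting) > 1
```

Storing the address once, in a separate student relation, would leave only one place to update.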
Deletion Anomaly
▪ Unintended loss of data caused by deleting related data
▪ E.g.: if a student is registered for a course, and the
course name and course ID are stored in that same
relation, deleting the only student registered for the
course also deletes the course details
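The deletion anomaly can be sketched the same way. The rows and course names below are hypothetical sample data, chosen so that one course has a single registered student.

```python
# Hypothetical rows: course details live only inside registration rows.
registrations = [
    {"StudNo": 1, "CourseID": "C101", "CourseName": "Databases"},
    {"StudNo": 2, "CourseID": "C102", "CourseName": "Networks"},
]

def known_courses(rows):
    """Map each CourseID that appears in the rows to its CourseName."""
    return {row["CourseID"]: row["CourseName"] for row in rows}

before = known_courses(registrations)

# Delete student 2, who is the only student registered for C102.
registrations = [row for row in registrations if row["StudNo"] != 2]

after = known_courses(registrations)

# Courses that the database no longer knows anything about.
lost = set(before) - set(after)
```

After the delete, C102 and its name are gone even though nobody intended to remove the course.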
Normalisation Process
▪ Normalisation is a formal process for deciding which
attributes should be grouped together in a relation.
▪ Normalisation is the process of decomposing relations
with anomalies to produce smaller, well-structured
relations.
▪ Normal forms are the rules used for structuring relations
Table with repeating groups
• Remove repeating groups
1st Normal Form (1NF)
• Remove partial dependencies
2nd Normal Form (2NF)
• Remove transitive dependencies
3rd Normal Form (3NF)
• Make every determinant a candidate key
Boyce-Codd Normal Form (BCNF)
• Remove multivalued dependencies
4th Normal Form (4NF)
• Remove join dependencies
5th Normal Form (5NF)
First Normal Form (1NF)
▪ A relation is in 1NF if it contains no repeating groups
or multivalued attributes.
▪ In plain English: each record must be unique, and each
table cell must contain a single value.
▪ This can be achieved by splitting the table into two tables:
1. A table containing the single-valued attributes, with a key
▪ Project_1 (ProjNo, ProjName)
2. A table containing the repeating-group attributes, with a
composite key
▪ Works_1 (ProjNo, EmpNo, EmpName, JobTitle, HourlyRate,
HrsWorked)
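The 1NF split described above can be sketched in Python. The nested input structure and all sample values are assumptions; only the relation and attribute names come from the slides.

```python
# Hypothetical unnormalised records: each project carries a repeating group of employees.
unnormalised = [
    {"ProjNo": 1, "ProjName": "Alpha", "Employees": [
        {"EmpNo": 10, "EmpName": "Lee", "JobTitle": "Analyst", "HourlyRate": 40, "HrsWorked": 5},
        {"EmpNo": 11, "EmpName": "Kim", "JobTitle": "Tester", "HourlyRate": 30, "HrsWorked": 8},
    ]},
    {"ProjNo": 2, "ProjName": "Beta", "Employees": [
        {"EmpNo": 10, "EmpName": "Lee", "JobTitle": "Analyst", "HourlyRate": 40, "HrsWorked": 3},
    ]},
]

# Project_1 (ProjNo, ProjName): single-valued attributes, keyed by ProjNo.
project_1 = [(p["ProjNo"], p["ProjName"]) for p in unnormalised]

# Works_1: one flat row per member of the repeating group,
# keyed by the composite (ProjNo, EmpNo).
works_1 = [
    (p["ProjNo"], e["EmpNo"], e["EmpName"], e["JobTitle"], e["HourlyRate"], e["HrsWorked"])
    for p in unnormalised
    for e in p["Employees"]
]

# Every (ProjNo, EmpNo) pair should identify exactly one row.
composite_keys = [(row[0], row[1]) for row in works_1]
```

Each cell now holds a single value, and the pair (ProjNo, EmpNo) is unique across Works_1.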
Before Normalisation
Table with Repeating Groups
After Applying 1st Normal Form
Composite Key
Functional Dependency
▪ The value of one attribute (or set of attributes) in a
tuple determines the value of other attributes in the
same tuple.
▪ E.g.:
▪ A, B, C, D are attributes in a relation called R.
▪ R (A, B, C, D)
▪ B and D are functionally dependent on A.
▪ A → B, D
▪ A worked example follows the definitions of determinant and dependent
A → B, D
▪ Determinant
▪ The attribute or attributes on the left-hand side of the
functional dependency, which determine the values of other
attributes in the same tuple.
▪ Dependent
▪ The attribute or attributes on the right-hand side of the
functional dependency, which depend on the determinant.
Example for Functional Dependency
▪ In this table, if we know the EmpNo, then we can find
the EmpName, JobTitle and HourlyRate
▪ Therefore, those 3 attributes are functionally
dependent on EmpNo
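One way to test whether a functional dependency holds in a given set of rows is sketched below. The helper name holds and the sample rows are hypothetical. Note that sample data can only refute a dependency; whether it holds in general is a rule of the business, not of the data.

```python
def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent holds in the given rows."""
    seen = {}
    for row in rows:
        lhs = tuple(row[a] for a in determinant)
        rhs = tuple(row[a] for a in dependent)
        # setdefault stores the first rhs seen for this lhs and returns it afterwards.
        if seen.setdefault(lhs, rhs) != rhs:
            return False
    return True

# Hypothetical sample rows shaped like Works_1.
works_1 = [
    {"ProjNo": 1, "EmpNo": 10, "EmpName": "Lee", "JobTitle": "Analyst", "HourlyRate": 40, "HrsWorked": 5},
    {"ProjNo": 1, "EmpNo": 11, "EmpName": "Kim", "JobTitle": "Tester", "HourlyRate": 30, "HrsWorked": 8},
    {"ProjNo": 2, "EmpNo": 10, "EmpName": "Lee", "JobTitle": "Analyst", "HourlyRate": 40, "HrsWorked": 3},
]

# EmpNo -> EmpName, JobTitle, HourlyRate: the same employee always has the same details.
emp_determines_details = holds(works_1, ["EmpNo"], ["EmpName", "JobTitle", "HourlyRate"])

# EmpNo -> HrsWorked does not hold: employee 10 has different hours on each project.
emp_determines_hours = holds(works_1, ["EmpNo"], ["HrsWorked"])
```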
Second Normal Form (2NF)
▪ The relation must be in 1NF
▪ AND
▪ No partial dependencies exist (every non-key attribute
is fully functionally dependent on the whole key).
Partial dependency: a non-key attribute functionally
depends on only part of a composite key
▪ To achieve 2NF, identify the partial dependencies in the
1NF table and split it into a set of relations, each with
its own unique identifier
▪ Functional dependencies (in Works_1)
▪ EmpNo → EmpName, JobTitle, HourlyRate
▪ ProjNo, EmpNo → HrsWorked
▪ Relations in 2NF
▪ Employee_2 (EmpNo, EmpName, JobTitle, HourlyRate)
▪ Works_2 (EmpNo, ProjNo, HrsWorked)
▪ Project_2 (ProjNo, ProjName)
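The 2NF decomposition of Works_1 can be sketched as two projections. The sample rows are hypothetical; the final step re-joins the projections to suggest that no information was lost by the split.

```python
# Hypothetical Works_1 rows: (ProjNo, EmpNo, EmpName, JobTitle, HourlyRate, HrsWorked)
works_1 = [
    (1, 10, "Lee", "Analyst", 40, 5),
    (1, 11, "Kim", "Tester", 30, 8),
    (2, 10, "Lee", "Analyst", 40, 3),
]

# Employee_2: attributes that depend on EmpNo alone; the set removes duplicates.
employee_2 = sorted({(emp_no, name, title, rate)
                     for _, emp_no, name, title, rate, _ in works_1})

# Works_2: HrsWorked depends on the whole key (EmpNo, ProjNo).
works_2 = [(emp_no, proj_no, hrs) for proj_no, emp_no, _, _, _, hrs in works_1]

# Joining the two relations on EmpNo reproduces the original Works_1 rows.
employee_by_no = {e[0]: e for e in employee_2}
rejoined = [(proj_no, emp_no) + employee_by_no[emp_no][1:] + (hrs,)
            for emp_no, proj_no, hrs in works_2]
```

Employee 10's details are now stored once in Employee_2 instead of once per project.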
Before Applying 2nd Normal Form
Partial Dependency
After Applying 2nd Normal Form
Full Dependency on the Key
Third Normal Form (3NF)
▪ A relation is in 3NF if it is in 2NF and no
transitive dependencies exist
▪ Transitive dependency: a non-key attribute is
functionally dependent on another non-key attribute
▪ In plain English:
▪ A transitive dependency exists when changing one non-key
column might require a change to another non-key column
▪ E.g.:
▪ Changing a name may affect a title or designation
▪ To achieve 3NF, identify the transitive dependencies in the
2NF table and split it into a set of relations, each with
its own unique identifier
▪ Dependencies (in Employee_2)
▪ JobTitle → HourlyRate
▪ EmpNo → EmpName, JobTitle
▪ Relations in 3NF
▪ Job_3 (JobTitle, HourlyRate)
▪ Employee_3 (EmpNo, EmpName, JobTitle)
▪ Works_3 (EmpNo, ProjNo, HrsWorked)
▪ Project_3 (ProjNo, ProjName)
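The 3NF step on Employee_2 can be sketched as follows. The sample rows and the rate change are hypothetical; the relation and attribute names follow the slides.

```python
# Hypothetical Employee_2 rows: (EmpNo, EmpName, JobTitle, HourlyRate)
employee_2 = [
    (10, "Lee", "Analyst", 40),
    (11, "Kim", "Tester", 30),
    (12, "Roy", "Analyst", 40),
]

# Job_3: JobTitle -> HourlyRate, so each title's rate is stored exactly once.
job_3 = {title: rate for _, _, title, rate in employee_2}

# Employee_3: HourlyRate is dropped; JobTitle stays as a reference into Job_3.
employee_3 = [(emp_no, name, title) for emp_no, name, title, _ in employee_2]

# A rate change is now a single update instead of one update per employee.
job_3["Analyst"] = 45

# Every analyst sees the same rate, because it is looked up rather than copied.
analyst_rates = {job_3[title] for _, _, title in employee_3 if title == "Analyst"}
```

Because the rate lives only in Job_3, the update anomaly between employees sharing a job title cannot occur.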
Before Applying 3rd Normal Form
Transitive Dependency
After Applying 3rd Normal Form