Normal Forms

Objectives
In this lesson, you will learn to:
• Describe data redundancy
• Describe the first, second, and third normal forms
• Describe the Boyce-Codd Normal Form (BCNF)
• Appreciate the need for denormalization
RDBMS Concepts/ Session 3 / 1 of 22
Normalization
• The logical design of the database, including the tables and the relationships between them, is the core of an optimized relational database.
• A good logical database design can lay the foundation for optimal database and application performance. A poor logical database design can impair the performance of the entire system.
• Normalizing a logical database design involves using formal methods to separate the data into multiple, related tables.
• A greater number of narrow tables (with fewer columns) is characteristic of a normalized database. A few wide tables (with more columns) are characteristic of a non-normalized database.
Understanding Data Redundancy
• Redundancy means repetition of data
• Redundancy increases the time involved in updating, adding, and deleting data
• It also increases disk space usage and, as a result, disk I/O
Understanding Data Redundancy (Contd.)
• Redundancy can lead to the following problems:
  - Update anomalies: inserting, modifying, and deleting data may cause inconsistencies
  - Inconsistencies: errors are more likely to occur when facts are repeated
  - Unnecessary utilization of extra disk space
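
An update anomaly can be reproduced in a few lines. The sketch below is illustrative only: it uses Python's built-in sqlite3 module, and the table name employee_wide, the column names, and the sample rows are all assumptions made for this example, not part of the lesson's data.

```python
import sqlite3

# Hypothetical wide table: branch details are repeated on every employee row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_wide ("
    " employee_no INTEGER PRIMARY KEY,"
    " employee_name TEXT, branch_code TEXT, branch_name TEXT)"
)
conn.executemany(
    "INSERT INTO employee_wide VALUES (?, ?, ?, ?)",
    [
        (1, "Asha", "B01", "Central"),
        (2, "Ravi", "B01", "Central"),
        (3, "Mina", "B02", "North"),
    ],
)

# Update anomaly: the branch is renamed, but only one of its rows is changed.
conn.execute(
    "UPDATE employee_wide SET branch_name = 'Downtown' WHERE employee_no = 1"
)


def names_for_branch(code):
    """Return the distinct branch names stored for one branch code."""
    rows = conn.execute(
        "SELECT DISTINCT branch_name FROM employee_wide"
        " WHERE branch_code = ? ORDER BY branch_name",
        (code,),
    )
    return [r[0] for r in rows]


# Branch B01 now carries two different names: the repeated fact has diverged.
inconsistent = names_for_branch("B01")
```

Because the branch name is stored once per employee, a partial update leaves the table contradicting itself, which is the inconsistency the slide describes.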
Definition of Normalization
• Normalization is a scientific method of breaking down complex table structures into simple table structures by using certain rules
• It allows you to reduce redundancy in a table and eliminate the problems of inconsistency and disk space usage
• Normalization results in the formation of tables that satisfy certain specified rules and represent certain normal forms
Normal Forms
• The most important and widely used normal forms are:
  - First Normal Form (1NF)
  - Second Normal Form (2NF)
  - Third Normal Form (3NF)
  - Boyce-Codd Normal Form (BCNF)
First Normal Form
• A table is said to be in 1NF when each cell of the table contains precisely one value
• Functional Dependency
  - Normalization theory is based on the fundamental notion of functional dependency
  - Given a relation R, attribute A is functionally dependent on attribute B if each value of B in R is associated with precisely one value of A
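
A functional dependency can be tested against sample data: the dependency "determinant → dependent" holds when every determinant value maps to exactly one dependent value. The following is a minimal sketch; the function name and the sample rows are hypothetical. Sample data can only disprove a dependency, so a True result means "no counterexample in these rows", not a proof about the real-world rule.

```python
def is_functionally_dependent(rows, determinant, dependent):
    """Return True if `determinant -> dependent` holds in `rows`.

    rows is a list of dicts; determinant and dependent are column names.
    """
    seen = {}
    for row in rows:
        key, value = row[determinant], row[dependent]
        # setdefault stores the first value seen for this key and returns the
        # stored one; a mismatch means one key maps to two different values.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows.
rows = [
    {"employee_no": 1, "branch_code": "B01", "branch_name": "Central"},
    {"employee_no": 2, "branch_code": "B01", "branch_name": "Central"},
    {"employee_no": 3, "branch_code": "B02", "branch_name": "North"},
]

name_depends_on_code = is_functionally_dependent(rows, "branch_code", "branch_name")
emp_depends_on_code = is_functionally_dependent(rows, "branch_code", "employee_no")
```

Here the branch name is determined by the branch code, while the employee number is not, since one branch has several employees.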
Un-Normalised Data
• Employee No
• Employee Name
• Branch Code
• Branch Name
• Branch Location
• Certification ID 1…n
• Certification Name 1…n
• Certification done at
• Marks obtained
Rule 1
• Eliminate repeating groups:
  - Make a separate table for each set of repeated attributes and give each table a primary key.
FNF
• Table 1
  - Employee No
  - Employee Name
  - Branch Code
  - Branch Name
  - Branch Location
• Table 2
  - Employee No
  - Certification ID
  - Certification Name
  - Certification done at
  - Marks obtained
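
The two first-normal-form tables can be sketched as a schema. This is one possible rendering using Python's sqlite3 module, not the lesson's own definition: the table names, column names, column types, and sample rows are assumptions, and so is the composite primary key (Employee No plus Certification ID) on the second table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    employee_no     INTEGER PRIMARY KEY,
    employee_name   TEXT,
    branch_code     TEXT,
    branch_name     TEXT,
    branch_location TEXT
);
-- The repeating group becomes one row per (employee, certification) pair.
CREATE TABLE employee_certification (
    employee_no        INTEGER REFERENCES employee(employee_no),
    certification_id   TEXT,
    certification_name TEXT,
    done_at            TEXT,
    marks_obtained     INTEGER,
    PRIMARY KEY (employee_no, certification_id)
);
""")

# Hypothetical sample data: one employee holding two certifications.
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 'B01', 'Central', 'Pune')")
conn.executemany(
    "INSERT INTO employee_certification VALUES (?, ?, ?, ?, ?)",
    [
        (1, "C10", "SQL Basics", "Pune", 82),
        (1, "C20", "Data Modeling", "Mumbai", 91),
    ],
)

cert_count = conn.execute(
    "SELECT COUNT(*) FROM employee_certification WHERE employee_no = 1"
).fetchone()[0]
```

Each certification now occupies its own row, so every cell holds a single value and an employee can hold any number of certifications without adding columns.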
Second Normal Form (2NF)
• A table is said to be in 2NF when it is in 1NF and every non-key attribute in the row is functionally dependent upon the whole key, and not just part of the key
• To ensure that a table is in 2NF, you should:
  - Find and remove attributes that are functionally dependent on only a part of the key and not on the whole key, and place them in a different table
  - Group the remaining attributes
Rule 2
• Eliminate Redundant Data
  - If an attribute depends on only part of a multi-attribute (composite) key, move it to a separate table.
• The Certification Name appears redundantly (it also depends on only a part of the composite key).
SNF
• Employee
  - Employee No
  - Employee Name
  - Branch Code
  - Branch Name
  - Branch Location
• Certifications
  - Certification ID
  - Certification Name
• Emp Certifications
  - Certification ID
  - Employee No
  - Certification done at
  - Marks obtained
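
The move from first to second normal form can be sketched as a decomposition of rows. The example below is illustrative: the sample rows and variable names are assumptions, and plain Python structures stand in for tables.

```python
# Hypothetical 1NF rows:
# (employee_no, certification_id, certification_name, done_at, marks_obtained)
first_nf_rows = [
    (1, "C10", "SQL Basics", "Pune", 82),
    (2, "C10", "SQL Basics", "Delhi", 75),
    (1, "C20", "Data Modeling", "Mumbai", 91),
]

# certification_name depends only on certification_id (part of the key),
# so it moves to its own table, keyed by certification_id.
certifications = {cert_id: name for _, cert_id, name, _, _ in first_nf_rows}

# The remaining attributes depend on the whole key
# (employee_no, certification_id) and stay together.
emp_certifications = [
    (emp_no, cert_id, done_at, marks)
    for emp_no, cert_id, _, done_at, marks in first_nf_rows
]
```

After the split, the name "SQL Basics" is stored once instead of once per employee who holds that certification.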
Third Normal Form (3NF)
• A relation is said to be in 3NF when it is in 2NF and every non-key attribute is functionally dependent only on the primary key
• To ensure that a table is in 3NF, you should:
  - Find and remove non-key attributes that are functionally dependent on attributes that are not the primary key and place them in a different table
  - Group the remaining attributes
Rule 3
• Eliminate columns not dependent on the key
  - The Employee table satisfies the first and second normal forms.
  - But the key is Employee No, and the branch name and location describe only a branch, not an employee.
TNF
• Employee
  - Employee No
  - Name
  - Branch Code
• Branch
  - Branch Code
  - Branch Name
  - Location
• Certification
  - Cert. ID
  - Cert. Name
• Emp Certification
  - Emp No
  - Cert. ID
  - Cert. done at
  - Marks obtained
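
The benefit of moving branch details into their own table can be sketched with sqlite3. Only the Employee and Branch tables are modeled here; the column types and sample rows are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE branch (
    branch_code TEXT PRIMARY KEY,
    branch_name TEXT,
    location    TEXT
);
CREATE TABLE employee (
    employee_no INTEGER PRIMARY KEY,
    name        TEXT,
    branch_code TEXT REFERENCES branch(branch_code)
);
INSERT INTO branch VALUES ('B01', 'Central', 'Pune');
INSERT INTO employee VALUES (1, 'Asha', 'B01');
INSERT INTO employee VALUES (2, 'Ravi', 'B01');
-- The branch name is stored once, so a single UPDATE renames it everywhere.
UPDATE branch SET branch_name = 'Downtown' WHERE branch_code = 'B01';
""")

# Branch details are looked up through the branch_code reference.
branch_names = [
    row[0]
    for row in conn.execute(
        "SELECT b.branch_name FROM employee e"
        " JOIN branch b ON b.branch_code = e.branch_code"
        " ORDER BY e.employee_no"
    )
]
```

Every employee of the branch sees the new name after one update, at the cost of a join when branch details are needed alongside employee data.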
Boyce-Codd Normal Form
• The original definition of 3NF was inadequate in some situations
• It was not satisfactory for tables:
  - that had multiple candidate keys
  - where the multiple candidate keys were composite
  - where the multiple candidate keys overlapped
• Therefore, a new normal form, the Boyce-Codd Normal Form (BCNF), was introduced
• A relation is in the Boyce-Codd normal form (BCNF) if and only if every determinant is a candidate key
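
The BCNF condition can be checked mechanically once the functional dependencies are written down. The sketch below uses the common working form of the test: a non-trivial dependency violates BCNF when its determinant is not a superkey, that is, when the determinant's attribute closure does not cover the whole relation. The function names and the student/subject/teacher relation are hypothetical; the relation is a textbook-style example with two overlapping composite candidate keys, (student, subject) and (student, teacher).

```python
def closure(attrs, fds):
    """Return every attribute determined by `attrs` under `fds`.

    fds is a list of (determinant, dependent) pairs of frozensets.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the determinant is already covered, its dependents follow.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Return the non-trivial FDs whose determinant is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != set(all_attrs)
    ]


# Hypothetical relation: each teacher teaches exactly one subject.
attrs = {"student", "subject", "teacher"}
fds = [
    (frozenset({"student", "subject"}), frozenset({"teacher"})),
    (frozenset({"teacher"}), frozenset({"subject"})),
]
violations = bcnf_violations(attrs, fds)
```

In this example the dependency teacher → subject is reported, because teacher alone does not identify a row even though it determines the subject.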
Characteristics of a normalized database
• Each table must have a key field.
• Every field must contain atomic (single-valued) data.
• There must be no repeating fields.
• Each table must contain information about a single entity.
• Each field in a table must depend on the key field.
• All non-key fields must be mutually independent.
Understanding Denormalization
• The end product of normalization is a set of related tables that comprise the database
• However, in the interests of speed of response to critical queries, which demand information from more than one table, it is sometimes wiser to introduce a degree of redundancy in tables
• The intentional introduction of redundancy in a table to improve performance is called denormalization
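
One way to picture denormalization is a redundant column that saves a join. In the sketch below, the branch name is copied onto each employee row so the critical query reads a single table. The schema, the sample rows, and the use of a trigger to keep the copy in sync are all assumptions made for illustration; a trigger is only one of several ways to maintain redundant data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE branch (branch_code TEXT PRIMARY KEY, branch_name TEXT)"
)
# Denormalized: branch_name is copied onto each employee row to avoid a join.
conn.execute(
    "CREATE TABLE employee ("
    " employee_no INTEGER PRIMARY KEY, name TEXT,"
    " branch_code TEXT, branch_name TEXT)"
)
# The redundant copy must be kept in sync; here a trigger does that.
conn.execute(
    "CREATE TRIGGER sync_branch_name AFTER UPDATE OF branch_name ON branch "
    "BEGIN "
    "  UPDATE employee SET branch_name = NEW.branch_name "
    "  WHERE branch_code = NEW.branch_code; "
    "END"
)
conn.execute("INSERT INTO branch VALUES ('B01', 'Central')")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 'B01', 'Central')")
conn.execute("UPDATE branch SET branch_name = 'Downtown' WHERE branch_code = 'B01'")

# The critical query now reads one table only.
row = conn.execute(
    "SELECT name, branch_name FROM employee WHERE employee_no = 1"
).fetchone()
```

The read becomes simpler, but every write to the branch name now touches extra rows and extra storage, which is the cost that comes with the redundancy.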
Summary
In this lesson, you learned that:
• Normalization is used to simplify table structures.
• Normalization results in the formation of tables that satisfy certain specified constraints, and represent certain normal forms. The normal forms are used to ensure that various types of anomalies and inconsistencies are not introduced in the database. A table structure is always in a certain normal form. Several normal forms have been identified.
Summary (Contd.)
• The most important and widely used of these are:
  - First Normal Form (1NF)
  - Second Normal Form (2NF)
  - Third Normal Form (3NF)
  - Boyce-Codd Normal Form (BCNF)
• The intentional introduction of redundancy in a table in order to improve performance is called denormalization.
• The decision to denormalize results in a trade-off between performance and data integrity.
• Denormalization increases disk space utilization.