Relation Normalization

(Chapter 14)
Database Management
COP4540, SCS, FIU
Modification Anomalies
• What are modification anomalies?
– Errors or inconsistencies that may result when a user
attempts to update a relation.
• Types of anomalies.
– Insertion anomalies.
• An independent piece of information cannot be recorded in a
relation unless unrelated information is inserted
at the same time.
– Update anomalies.
• Updating one piece of information requires changes at multiple
locations, beyond what the referential integrity rule requires.
– Deletion anomalies.
• The deletion of a piece of information unintentionally removes
other information.
Normal Forms
• Normal forms are classes of relations and the techniques
for preventing anomalies.
• Normal forms are classified by the type of modification
anomalies that have been removed.
• Types of normal forms:
– First Normal Form (1NF).
– Second Normal Form (2NF).
– Third Normal Form (3NF).
– Boyce-Codd Normal Form (BCNF).
– Fourth Normal Form (4NF).
– Fifth Normal Form (5NF).
– Domain/Key Normal Form (DK/NF).
First Normal Form (1NF)
• A relation R is in 1NF if and only if all
attribute domains contain atomic values
only.
• Any table that meets the definition of a relation
is said to be in first normal form, i.e. a
relation in a relational schema is always in
1NF.
Second Normal Form (2NF)
• A relation is in 2NF if and only if it is in
1NF and has no partial dependencies.
• Partial dependency
– A dependency in which one or more non-key
attributes are functionally dependent on part
(but not all) of the key.
• Two extreme cases in which a 1NF relation is automatically in 2NF
– The primary key consists of only one attribute.
– No non-key attributes exist in the relation.
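As a rough illustration of the partial-dependency definition above, the following Python sketch computes the attribute closure of every proper subset of the key and reports any non-key attributes that a subset determines. The ENROLL relation and its attribute names are hypothetical (they do not come from these slides), and the sketch assumes the relation has a single candidate key.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(key, fds):
    """Map each proper subset of the key to the non-key attributes it determines."""
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            determined = closure(subset, fds) - set(key)
            if determined:
                found[subset] = determined
    return found


# Hypothetical relation ENROLL(StudentId, CourseId, StudentName, Grade).
key = {"StudentId", "CourseId"}
fds = [
    ({"StudentId", "CourseId"}, {"Grade"}),
    ({"StudentId"}, {"StudentName"}),  # depends on only part of the key
]
print(partial_dependencies(key, fds))
```

With a single-attribute key there are no proper non-empty subsets to test, so the function returns an empty result, which matches the first of the two cases listed above.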
Third Normal Form
• A relation R is in 3NF if: whenever A1 A2 … An → B
is a nontrivial dependency, either
{A1, …, An} is a superkey, or B is a member
of some key.
• A relation is in 3NF if:
– it is in 2NF and has no transitive dependencies.
– Transitive dependency
• A functional dependency between two (or more)
non-key attributes.
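The first 3NF definition above can be checked mechanically. The sketch below is one possible way to do so in Python, assuming the relation is small enough for a brute-force search of candidate keys. The function names are illustrative, and the attributes and FDs are those of the SALES example in these slides.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Brute-force search for minimal attribute sets that determine everything."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(k <= candidate for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys


def third_nf_violations(attributes, fds):
    """List (X, B) pairs where X -> B is nontrivial, X is no superkey, B is in no key."""
    prime = set().union(*candidate_keys(attributes, fds))
    violations = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) == set(attributes)
        for b in sorted(rhs - lhs):  # rhs - lhs drops the trivial part
            if not is_superkey and b not in prime:
                violations.append((lhs, b))
    return violations


sales_attrs = {"Cust_ID", "Name", "Salesperson", "Region"}
sales_fds = [
    ({"Cust_ID"}, {"Cust_ID", "Name", "Salesperson", "Region"}),
    ({"Salesperson"}, {"Region"}),
]
print(third_nf_violations(sales_attrs, sales_fds))  # [({'Salesperson'}, 'Region')]
```

The only violation reported is Salesperson → Region: Salesperson is not a superkey and Region belongs to no key, which is the transitive dependency the second definition describes.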
Example
SALES(Cust_ID, Name, Salesperson, Region)
FDs:
Cust_ID → Cust_ID Name Salesperson Region
Salesperson → Region

Cust_ID  Name       Salesperson  Region
8023     Anderson   Smith        South
9167     Bancroft   Hicks        West
7924     Hobbs      Smith        South
6837     Tucker     Hernandez    East
8596     Eckersley  Hicks        West
7018     Arnold     Faulb        North
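To make the anomalies in this relation concrete, here is a small Python sketch over the SALES rows above. Representing the rows as tuples is an assumption made only for illustration; no particular DBMS behaviour is implied.

```python
# Rows of SALES(Cust_ID, Name, Salesperson, Region) from the example above.
sales = [
    (8023, "Anderson", "Smith", "South"),
    (9167, "Bancroft", "Hicks", "West"),
    (7924, "Hobbs", "Smith", "South"),
    (6837, "Tucker", "Hernandez", "East"),
    (8596, "Eckersley", "Hicks", "West"),
    (7018, "Arnold", "Faulb", "North"),
]

# Update anomaly: Smith's region is stored once per customer, so changing it
# means touching every one of these rows.
rows_for_smith = [row for row in sales if row[2] == "Smith"]
print(len(rows_for_smith))  # 2

# Deletion anomaly: Tucker is Hernandez's only customer, so deleting customer
# 6837 also removes the fact that Hernandez covers the East region.
after_delete = [row for row in sales if row[0] != 6837]
known_salespeople = {row[2] for row in after_delete}
print("Hernandez" in known_salespeople)  # False

# Insertion anomaly: a salesperson with no customers yet has no Cust_ID,
# so no complete row (including its key) can be formed for that person.
```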
Example
SALES1(Cust_ID, Name, Salesperson)
SPERSON(Salesperson, Region)

SALES1:
Cust_ID  Name       Salesperson
8023     Anderson   Smith
9167     Bancroft   Hicks
7924     Hobbs      Smith
6837     Tucker     Hernandez
8596     Eckersley  Hicks
7018     Arnold     Faulb

SPERSON:
Salesperson  Region
Smith        South
Hicks        West
Hernandez    East
Faulb        North
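One way to convince yourself that this decomposition loses no information is to join the two relations back together on Salesperson and compare the result with the original SALES rows. The Python sketch below does that with plain tuples and a dictionary; the variable names are illustrative only.

```python
# SALES1(Cust_ID, Name, Salesperson) and SPERSON(Salesperson, Region).
sales1 = [
    (8023, "Anderson", "Smith"),
    (9167, "Bancroft", "Hicks"),
    (7924, "Hobbs", "Smith"),
    (6837, "Tucker", "Hernandez"),
    (8596, "Eckersley", "Hicks"),
    (7018, "Arnold", "Faulb"),
]
sperson = {"Smith": "South", "Hicks": "West", "Hernandez": "East", "Faulb": "North"}

# The original SALES(Cust_ID, Name, Salesperson, Region) rows.
original_sales = [
    (8023, "Anderson", "Smith", "South"),
    (9167, "Bancroft", "Hicks", "West"),
    (7924, "Hobbs", "Smith", "South"),
    (6837, "Tucker", "Hernandez", "East"),
    (8596, "Eckersley", "Hicks", "West"),
    (7018, "Arnold", "Faulb", "North"),
]

# Natural join on Salesperson: look up each customer's salesperson in SPERSON.
rejoined = [(cust_id, name, sp, sperson[sp]) for cust_id, name, sp in sales1]
print(rejoined == original_sales)  # True
```

Each salesperson's region now appears exactly once in SPERSON, so changing it is a single update.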
Another Example
SHIPMENT(Snum, Origin, Destination, Distance)
FDs:
Snum → Snum Origin Destination Distance
Origin Destination → Distance
Decomposed relations:
SHIPMENT(Snum, Origin, Destination)
DISTANCE(Origin, Destination, Distance)
Relation Normalization Question
SID  Name    CID    Grade  Text  Major  Dept
S1   Joseph  CIS01  A      B1    CS     CIS
S1   Joseph  CIS02  B      B2    CS     CIS
S2   Alice   CIS01  B      B1    IS     CIS
S3   Tom     EEE01  A      B3    EE     EEE

FDs:
SID CID → Grade
SID → Name Major
CID → Text
Major → Dept

1. What can be the primary key for the above relation?
2. Decompose the above relation into 2NF, then 3NF relations.
Boyce-Codd Normal Form
• A relation R is in BCNF if and only if: whenever a
nontrivial dependency A1 A2 … An → B1 B2 … Bm
holds for R, it must be the case that {A1, A2, …,
An} is a superkey for R.
• BCNF is one of the most important normal forms.
– Relations in BCNF have no anomalies with regard to
functional dependencies.
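The BCNF condition above translates almost directly into code. The following Python sketch, with illustrative function names, reports every given nontrivial FD whose left side is not a superkey. It is applied to the student/major/advisor relation used as the BCNF example in these slides, where Fname is the advisor's name.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the nontrivial FDs whose left side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != set(attributes)
    ]


attrs = {"StuId", "Major", "Fname"}
fds = [
    ({"StuId", "Major"}, {"Fname"}),
    ({"Fname"}, {"Major"}),
]
print(bcnf_violations(attrs, fds))  # [({'Fname'}, {'Major'})]
```

Here Fname → Major violates BCNF because Fname alone does not determine StuId. The relation is still in 3NF, since Major is part of the key {StuId, Major}.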
1. Each student may major in several subjects.
2. For each major, a given student has only one advisor.
3. Each major has several advisors.
4. Each advisor advises only one major.
5. Each advisor advises several students in one major.
FDs: StuId Major → Fname; Fname → Major

StuId  Major    Fname
S1     Physics  Einstein
S1     Music    Mozart
S2     Biology  Darwin
S3     Physics  Bohr
S4     Physics  Einstein

Decomposed relations:

StuId  Fname
S1     Einstein
S1     Mozart
S2     Darwin
S3     Bohr
S4     Einstein

Fname     Major
Einstein  Physics
Mozart    Music
Darwin    Biology
Bohr      Physics

Decomposition into BCNF
1. Set D = {R};
2. While there is a relation Q in D that is not in BCNF do
begin
  choose a relation Q in D that is not in BCNF;
  find an FD X → Y in Q that violates BCNF;
  expand the right side to include X+;
  replace Q in D by two relations
  (Q − X+) ∪ X and X+;
end;
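The algorithm above could be sketched in Python roughly as follows. This is one possible reading, not a definitive implementation. It assumes X+ means the closure of X restricted to the attributes of Q, and it finds a violating X by brute force over subsets of Q, which is only practical for small schemas. All function names are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def find_violation(rel, fds):
    """Return (X, X+) for an X that determines more than itself but not all of rel."""
    for size in range(1, len(rel)):
        for combo in combinations(sorted(rel), size):
            x = set(combo)
            x_plus = closure(x, fds) & rel  # closure restricted to rel
            if x < x_plus < rel:
                return x, x_plus
    return None


def bcnf_decompose(attributes, fds):
    """Split a relation until no remaining piece has a BCNF violation."""
    done, todo = [], [frozenset(attributes)]
    while todo:
        q = todo.pop()
        violation = find_violation(q, fds)
        if violation is None:
            done.append(q)
        else:
            x, x_plus = violation
            todo.append(frozenset((q - x_plus) | x))  # (Q − X+) ∪ X
            todo.append(frozenset(x_plus))            # X+
    return done


fds = [
    ({"StuId", "Major"}, {"Fname"}),
    ({"Fname"}, {"Major"}),
]
for relation in bcnf_decompose({"StuId", "Major", "Fname"}, fds):
    print(sorted(relation))
```

For the student/major/advisor relation this yields the two relations (StuId, Fname) and (Fname, Major). Both pieces produced by a split are strictly smaller than Q, so the loop terminates.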
Normalization steps (diagram):
Tables → 1NF: remove non-atomic attributes
1NF → 2NF: remove partial dependencies
2NF → 3NF: remove transitive dependencies
3NF → BCNF: remove remaining anomalies resulting from functional dependencies