IT420: Database Management and Organization
Normalization
31 January 2006
Adina Crăiniceanu www.cs.usna.edu/~adina
Database design
Entity-Relationship (ER) Model
Relational Model
Transform ER to relational model
Database Implementation
SQL
Kroenke, Database Processing
Refresh ER to Relational
Normalization
ER:
Entities
identifiers
Relationships
cardinality
Relational model
Tables
Constraints
Transform entities into tables
Transform relationships using foreign keys
Specify logic for enforcing minimum cardinalities
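The three steps above can be sketched with Python's standard-library sqlite3 module. The DEPARTMENT and EMPLOYEE entities, their columns, and the sample values are hypothetical and chosen only for illustration; the slides do not prescribe this schema. A NOT NULL foreign key is one possible way to enforce a minimum cardinality of 1 on the parent side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Step 1: each entity becomes a table; its identifier becomes the primary key.
conn.execute("CREATE TABLE DEPARTMENT (DeptName TEXT PRIMARY KEY, Phone TEXT)")

# Step 2: the 1:N relationship becomes a foreign key in the child table.
# Step 3: NOT NULL enforces a minimum cardinality of 1 on the parent side
# (every employee must belong to some department).
conn.execute("""
    CREATE TABLE EMPLOYEE (
        EmpNumber INTEGER PRIMARY KEY,
        LastName  TEXT,
        DeptName  TEXT NOT NULL REFERENCES DEPARTMENT (DeptName)
    )
""")

conn.execute("INSERT INTO DEPARTMENT VALUES ('Accounting', '555-0100')")
conn.execute("INSERT INTO EMPLOYEE VALUES (1, 'Smith', 'Accounting')")


def try_insert(sql):
    """Return True if the insert is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Rejected: there is no 'Marketing' row in DEPARTMENT.
orphan_ok = try_insert("INSERT INTO EMPLOYEE VALUES (2, 'Doe', 'Marketing')")
# Rejected: an employee without any department violates NOT NULL.
missing_ok = try_insert("INSERT INTO EMPLOYEE VALUES (3, 'Lee', NULL)")
```

A minimum cardinality of 1 on the child side (every department must have at least one employee) cannot be declared this way and would typically need application logic or triggers.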
Class Exercise: Transform ER model into Relational Model
[ER diagram with the following entities and attributes:
CLUB (FAALicense, Address, Phone)
AIRCRAFT (FAA number, ModelNumber, Color)
OWNER (Name, Phone, Address)
MEMBER (Name, Phone, Rating, TotalHours)
FLIGHT (FlightID, RentalDate, ReturnDate, TimeFlown)
Two relationships are annotated "Min Cardinality: 1"]
ER to Relational
Normalization
We have received one or more tables with data
The data is to be stored in a new database
QUESTION: Should the data be stored as received, or should it be transformed for storage?
Number  LastName  FirstName  Email            Rating  Wage
190     Smith     John       jsmith@usna.edu  4       25
673     Doe       Jane       jdoe@usna.edu    7       35
312     Doe       Bob        bred@usna.edu    8       40
152     Johnson   Matt       mat@usna.edu     7       35
Application constraint:
All employees with the same rating have the same wage (Rating → Wage)
Problems due to data redundancy?
Deletion Anomaly: What if we delete all employees with rating 8?
Insertion Anomaly: What if we have a new employee with new rating 12?
Update Anomaly: What if we change the wage for rating 7 to be 27?
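The deletion and update anomalies can be reproduced on the sample data with a few lines of Python. This is only an illustrative sketch: the list-of-dicts representation and the helper name wages_by_rating are assumptions, not part of the slides.

```python
employees = [
    {"Number": 190, "LastName": "Smith",   "Rating": 4, "Wage": 25},
    {"Number": 673, "LastName": "Doe",     "Rating": 7, "Wage": 35},
    {"Number": 312, "LastName": "Doe",     "Rating": 8, "Wage": 40},
    {"Number": 152, "LastName": "Johnson", "Rating": 7, "Wage": 35},
]


def wages_by_rating(rows):
    """Map each rating to the set of wages recorded for it."""
    result = {}
    for row in rows:
        result.setdefault(row["Rating"], set()).add(row["Wage"])
    return result


# Deletion anomaly: removing the only rating-8 employee also removes
# the fact that rating 8 pays a wage of 40.
after_delete = [row for row in employees if row["Rating"] != 8]
lost_rating_8 = 8 not in wages_by_rating(after_delete)

# Update anomaly: changing the wage in only one of the two rating-7 rows
# leaves the table contradicting Rating -> Wage.
after_update = [dict(row) for row in employees]
after_update[1]["Wage"] = 27
inconsistent = {
    rating
    for rating, wages in wages_by_rating(after_update).items()
    if len(wages) > 1
}

# Insertion anomaly: the wage for a new rating 12 cannot be stored
# until some employee with that rating exists.
```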
The EMPLOYEE table before and after an incorrect update operation on Wage for Rating = 7
Before:
Number  LastName  FirstName  Email            Rating  Wage
190     Smith     John       jsmith@usna.edu  4       25
673     Doe       Jane       jdoe@usna.edu    7       35
312     Doe       Bob        bred@usna.edu    8       40
152     Johnson   Matt       mat@usna.edu     7       35

After:
Number  LastName  FirstName  Email            Rating  Wage
190     Smith     John       jsmith@usna.edu  4       25
673     Doe       Jane       jdoe@usna.edu    7       37
312     Doe       Bob        bred@usna.edu    8       40
152     Johnson   Matt       mat@usna.edu     7       35
Number  LastName  FirstName  Email            Rating  Wage
190     Smith     John       jsmith@usna.edu  4       25
673     Doe       Jane       jdoe@usna.edu    7       35
312     Doe       Bob        bred@usna.edu    8       40
152     Johnson   Matt       mat@usna.edu     7       35
Number  LastName  FirstName  Email            Rating
190     Smith     John       jsmith@usna.edu  4
673     Doe       Jane       jdoe@usna.edu    7
312     Doe       Bob        bred@usna.edu    8
152     Johnson   Matt       mat@usna.edu     7

Rating  Wage
4       25
7       35
8       40
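The decomposition can be sketched in Python as two projections of the original table, followed by a join on Rating that rebuilds the original rows. The variable names (employee_wide, employee, rating_wage) are illustrative assumptions; the slides do not name the two new tables.

```python
# Columns: Number, LastName, FirstName, Email, Rating, Wage
employee_wide = [
    (190, "Smith", "John", "jsmith@usna.edu", 4, 25),
    (673, "Doe", "Jane", "jdoe@usna.edu", 7, 35),
    (312, "Doe", "Bob", "bred@usna.edu", 8, 40),
    (152, "Johnson", "Matt", "mat@usna.edu", 7, 35),
]

# Projection 1: every column except Wage.
employee = [row[:5] for row in employee_wide]

# Projection 2: one (Rating, Wage) pair per rating; duplicates collapse,
# so the wage for rating 7 is now stored exactly once.
rating_wage = sorted({(row[4], row[5]) for row in employee_wide})

# Joining the two tables on Rating rebuilds the original rows,
# so the decomposition loses no information.
wage_of = dict(rating_wage)
rejoined = [row + (wage_of[row[4]],) for row in employee]
```

With this layout, changing the wage for rating 7 touches a single row, and the wage for rating 8 survives even if every rating-8 employee is deleted.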
Problem?
Do we have to decompose / merge?
How do we identify problems caused by redundancy?
Functional dependencies
A functional dependency: the value of one attribute (or set of attributes) determines the value of a second attribute (or set of attributes):
Alpha → MIDNName
Alpha → (DormName, DormRoom)
The attribute (or set of attributes) on the left side of the functional dependency is called the determinant
Functional dependencies may be based on equations:
Charge = NbHours × HourlyPrice
(NbHours, HourlyPrice) → Charge
Functional dependencies are not equations!
Functional Dependencies Are Not Equations
ObjectColor → Weight
ObjectColor → Shape
ObjectColor → (Weight, Shape)
Composite determinant: A determinant of a functional dependency that consists of more than one attribute
(StudentName, ClassName) → Grade
(NbHours, HourlyPrice) → Charge
Functional Dependency (FD) Rules
If A → (B, C), then A → B and A → C
If (A, B) → C, then, in general, neither A nor B determines C by itself
A functional dependency is a statement about all allowable instances of a table
You cannot find the functional dependencies simply by looking at some data:
Data set limitations
Must be logically a determinant
Given some data in a table R, we can check if it violates some FD, but we cannot tell if the FD holds over R!
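The check described in the last bullet can be sketched in Python. The function name violates_fd and the grade rows are hypothetical, made up for illustration. Note the asymmetry: a True result disproves the FD, while a False result only means this particular sample does not contradict it.

```python
def violates_fd(rows, lhs, rhs):
    """Return True if two rows agree on all lhs columns but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        value = tuple(row[col] for col in rhs)
        # Remember the first rhs value seen for this lhs value;
        # any later row with the same lhs must carry the same rhs.
        if seen.setdefault(key, value) != value:
            return True
    return False


# Hypothetical sample data.
grades = [
    {"StudentName": "Ann", "ClassName": "IT420", "Grade": "A"},
    {"StudentName": "Ann", "ClassName": "IT350", "Grade": "B"},
    {"StudentName": "Bob", "ClassName": "IT420", "Grade": "B"},
]

# The composite determinant is not contradicted by this sample ...
composite_violated = violates_fd(grades, ["StudentName", "ClassName"], ["Grade"])
# ... but neither attribute alone determines Grade.
student_violated = violates_fd(grades, ["StudentName"], ["Grade"])
class_violated = violates_fd(grades, ["ClassName"], ["Grade"])
```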
Functional Dependencies in the SKU_DATA Table
Assuming the data is representative, determine the FDs
Functional Dependencies in the SKU_DATA Table
SKU → (SKU_Description, Department, Buyer)
SKU_Description → (SKU, Department, Buyer)
Buyer → Department
Class Exercise
Assuming that every FD consistent with the data is indeed an FD, determine the FDs in the ORDER_ITEM table
What Makes Determinant Values Unique?
A determinant is unique in a relation if, and only if, it determines every other column in the relation
Unique determinants = superkey
A set of columns is a key for a relation if:
1. No two distinct rows can have the same values in all key columns, and
2. This is not true for any proper subset of the key
If part 2 is false, the set of columns is a superkey
Primary key
Alternate key
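The two conditions can be tested against sample rows with a short Python sketch. The helper names is_unique and is_minimal_key are hypothetical. As with functional dependencies, a key is a statement about all allowable instances, so passing these checks on one data set only shows that the data does not rule the key out.

```python
from itertools import combinations


def is_unique(rows, cols):
    """Condition 1: no two rows share the same values in all given columns."""
    projected = [tuple(row[col] for col in cols) for row in rows]
    return len(set(projected)) == len(projected)


def is_minimal_key(rows, cols):
    """Conditions 1 and 2: cols is unique and no proper subset of cols is."""
    if not is_unique(rows, cols):
        return False
    return not any(
        is_unique(rows, subset)
        for size in range(1, len(cols))
        for subset in combinations(cols, size)
    )


employees = [
    {"Number": 190, "LastName": "Smith", "Email": "jsmith@usna.edu"},
    {"Number": 673, "LastName": "Doe", "Email": "jdoe@usna.edu"},
    {"Number": 312, "LastName": "Doe", "Email": "bred@usna.edu"},
    {"Number": 152, "LastName": "Johnson", "Email": "mat@usna.edu"},
]

number_is_key = is_minimal_key(employees, ["Number"])
# (Number, LastName) is unique but not minimal: a superkey, not a key.
pair_is_unique = is_unique(employees, ["Number", "LastName"])
pair_is_key = is_minimal_key(employees, ["Number", "LastName"])
# LastName repeats ("Doe"), so it fails condition 1.
lastname_is_unique = is_unique(employees, ["LastName"])
```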
Relations are categorized into normal forms based on which modification anomalies or other problems they are subject to:
Boyce-Codd Normal Form (BCNF) – A relation is in BCNF if every determinant is a candidate key
“I swear to construct my tables so that all nonkey columns are dependent on the key, the whole key and nothing but the key, so help me Codd.”
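One way to test the BCNF condition mechanically is to compute the attribute closure of each determinant and see whether it covers the whole relation. The closure loop below is a standard textbook technique, not something taken from these slides, and the function names are hypothetical. The sketch assumes that Number determines every other EMPLOYEE column, that all listed FDs are non-trivial, and it checks "determinant is a superkey" rather than testing minimality.

```python
def closure(attrs, fds):
    """Return all attributes determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole determinant, we gain its right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_bcnf(attributes, fds):
    """True if the determinant of every listed FD determines all attributes."""
    return all(closure(lhs, fds) == set(attributes) for lhs, _ in fds)


# Original EMPLOYEE table: Rating is a determinant but not a key.
wide_attrs = {"Number", "LastName", "FirstName", "Email", "Rating", "Wage"}
wide_fds = [
    ({"Number"}, {"LastName", "FirstName", "Email", "Rating"}),  # assumed key
    ({"Rating"}, {"Wage"}),
]
wide_ok = is_bcnf(wide_attrs, wide_fds)

# After splitting off (Rating, Wage), each table passes the check.
employee_ok = is_bcnf(
    {"Number", "LastName", "FirstName", "Email", "Rating"},
    [({"Number"}, {"LastName", "FirstName", "Email", "Rating"})],
)
rating_wage_ok = is_bcnf({"Rating", "Wage"}, [({"Rating"}, {"Wage"})])
```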
Eliminating Modification Anomalies from Functional Dependencies in Relations
Put all relations into Boyce-Codd Normal Form (BCNF):
Putting a Relation into BCNF: EQUIPMENT_REPAIR
EQUIPMENT_REPAIR (ItemNumber, Type, AcquisitionCost, RepairNumber, RepairDate, RepairAmount)
ItemNumber → (Type, AcquisitionCost)
RepairNumber → (ItemNumber, Type, AcquisitionCost, RepairDate, RepairAmount)
ITEM (ItemNumber, Type, AcquisitionCost)
REPAIR (ItemNumber, RepairNumber, RepairDate, RepairAmount)
Where REPAIR.ItemNumber must exist in ITEM.ItemNumber
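The split into ITEM and REPAIR can be sketched in Python as two projections, followed by a check of the referential integrity constraint. The sample rows and the helper name project are hypothetical, invented here for illustration only.

```python
COLUMNS = (
    "ItemNumber", "Type", "AcquisitionCost",
    "RepairNumber", "RepairDate", "RepairAmount",
)

# Hypothetical sample rows for EQUIPMENT_REPAIR.
equipment_repair = [
    (100, "Drill Press", 3500.00, 2000, "2006-01-05", 375.00),
    (200, "Lathe", 4750.00, 2100, "2006-01-07", 255.00),
    (100, "Drill Press", 3500.00, 2200, "2006-01-19", 178.00),
]


def project(rows, columns, wanted):
    """Project rows onto the wanted columns, dropping duplicate result rows."""
    positions = [columns.index(col) for col in wanted]
    return sorted({tuple(row[i] for i in positions) for row in rows})


item = project(equipment_repair, COLUMNS,
               ("ItemNumber", "Type", "AcquisitionCost"))
repair = project(equipment_repair, COLUMNS,
                 ("ItemNumber", "RepairNumber", "RepairDate", "RepairAmount"))

# Referential integrity: every REPAIR.ItemNumber must exist in ITEM.ItemNumber.
item_numbers = {row[0] for row in item}
references_ok = all(row[0] in item_numbers for row in repair)
```

In the sample, item 100 is repaired twice, yet its Type and AcquisitionCost appear only once in ITEM.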
Putting a Relation into BCNF: New Relations
Class Exercise
Put the following relation into BCNF
A multivalued dependency occurs when a determinant determines a particular set of values:
Employee →→ Degree
Employee →→ Sibling
PartKit →→ Part
The determinant of a multivalued dependency can never be a primary key
Eliminating Anomalies from Multivalued Dependencies
Multivalued dependencies are not a problem if they are in a separate relation, so:
Always put multivalued dependencies into their own relation
This is known as Fourth Normal Form (4NF)
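A small Python sketch shows why keeping two independent multivalued dependencies in one relation is awkward. The employee, degree, and sibling values are hypothetical. When Employee →→ Degree and Employee →→ Sibling share a table, every degree has to be paired with every sibling.

```python
# Each multivalued dependency in its own relation.
employee_degree = [("Jones", "BS"), ("Jones", "MS")]
employee_sibling = [("Jones", "Fred"), ("Jones", "Sally")]

# A single combined relation must hold every (degree, sibling) combination
# for the same employee: 2 degrees x 2 siblings = 4 rows.
combined = [
    (emp, degree, sibling)
    for emp, degree in employee_degree
    for other_emp, sibling in employee_sibling
    if emp == other_emp
]

# Recording one new degree in the combined relation takes one row per sibling;
# in the separate relation it takes exactly one row.
rows_needed_combined = len({sibling for _, sibling in employee_sibling})
rows_needed_separate = 1

# Projecting the combined relation gives back the two separate relations.
degrees_back = sorted({(emp, degree) for emp, degree, _ in combined})
siblings_back = sorted({(emp, sibling) for emp, _, sibling in combined})
```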