IT420: Database Management and Organization Normalization 31 January 2006

advertisement

IT420: Database Management and

Organization

Normalization

31 January 2006

Adina Cr ăiniceanu www.cs.usna.edu/~adina

Previously on IT420

 Database design

 Entity-Relationship (ER) Model

 Relational Model

 Transform ER to relational model

 Database Implementation

 SQL

Kroenke, Database Processing

Goal

 Refresh ER to Relational

 Normalization

Kroenke, Database Processing

ER to Relational

 ER:

 Entities

 identifiers

 Relationships

 cardinality

 Relational model

 Tables

 Constraints

Kroenke, Database Processing

ER to Relational

 Transform entities in tables

 Transform relationships using foreign keys

 Specify logic for enforcing minimum cardinalities

Kroenke, Database Processing

Class Exercise: Transform ER model into Relational Model

CLUB

FAALicense

AIRCRAFT

FAA number

Address

Phone ModelNumber

Color

Min Cardinality:1

OWNER

Name

Phone

Min Cardinality:1 Address

MEMBER

Name

Phone

Rating

TotalHours

FLIGHT

FlightID

RentalDate

ReturnDate

TimeFlown

Kroenke, Database Processing

Outline

 ER to Relational

 Normalization

Kroenke, Database Processing

Premise

 We have received one or more tables with data

 The data is to be stored in a new database

 QUESTION: Should the data be stored as received, or should it be transformed for storage?

Kroenke, Database Processing

Data Redundancy

Number LastName FirstName

190

673

312

Smith

Doe

Doe

John

Jane

Bob

Email jsmith@usna.edu

jdoe@usna.edu

bred@usna.edu

152 Johnson Matt mat@usna.edu

Rating

4

7

8

7

Wage

25

35

40

35

Application constraint:

All employees with same rating have the same wage (Rating  Wage)

Problems due to data redundancy?

Kroenke, Database Processing

Modification Anomalies

 Deletion Anomaly : What if we delete all employees with rating 8?

 Insertion Anomaly : What if we have a new employee with new rating 12?

 Update Anomaly : What if we change the wage for rating 7 to be 27?

Kroenke, Database Processing

Update Anomalies

 The EMPLOYEE table before and after an incorrect update operation on Wage for Rating = 7

Number LastName FirstName

190

673

312

Smith

Doe

Doe

John

Jane

Bob

152 Johnson Matt

Email jsmith@usna.edu

jdoe@usna.edu

bred@usna.edu

mat@usna.edu

Rating

4

7

8

7

Number LastName FirstName

190

673

312

Smith

Doe

Doe

John

Jane

Bob

152 Johnson Matt

Email jsmith@usna.edu

jdoe@usna.edu

bred@usna.edu

mat@usna.edu

Kroenke, Database Processing

7

Rating

4

7

8

Wage

25

35

40

35

Wage

25

37

40

35

Table decomposition

Number LastName FirstName

190 Smith John

673

312

Doe

Doe

Jane

Bob

152 Johnson Matt

Email jsmith@usna.edu

jdoe@usna.edu

bred@usna.edu

mat@usna.edu

Rating

4

7

8

7

Wage

25

35

40

35

Number LastName FirstName

190 Smith John

Email jsmith@usna.edu

673

312

Doe

Doe

152 Johnson

Jane

Bob

Matt jdoe@usna.edu

bred@usna.edu

mat@usna.edu

Rating

4

7

8

7

Kroenke, Database Processing

Rating

4

7

8

Wage

25

35

40

Problem?

Decisions

 Do we have to decompose / merge?

 How do we identify problems caused by redundancy?

 Functional dependencies

Kroenke, Database Processing

Functional Dependency (FD)

 A functional dependency: the value of one (a set of) attribute(s) determines the value of a second (set of) attribute(s):

Alpha  MIDNName

Alpha  (DormName, DormRoom)

 The attribute on the left side of the functional dependency is called the determinant

 Functional dependencies may be based on equations:

Charge = NbHours X HourlyPrice

(NbHours, HourlyPrice)  Charge

 Function dependencies are not equations!

Kroenke, Database Processing

Functional Dependencies Are Not Equations

ObjectColor  Weight

ObjectColor  Shape

ObjectColor  (Weight, Shape)

Kroenke, Database Processing

Composite Determinants

 Composite determinant : A determinant of a functional dependency that consists of more than one attribute

 (StudentName, ClassName)  Grade

 (NbHours, HourlyPrice)  Charge

Kroenke, Database Processing

Functional Dependency (FD) Rules

 If A  (B, C), then A  B and A  C

 If (A,B)  C, then neither A nor B determines C by itself

Kroenke, Database Processing

FD Facts

 A functional dependency is a statement about all allowable instances of a table

 You cannot find the functional dependencies simply by looking at some data:

 Data set limitations

 Must be logically a determinant

 Given some data in a table R, we can check if it violates some FD, but we cannot tell if the FD holds over R!

Kroenke, Database Processing

Functional Dependencies in the

SKU_DATA Table

Assuming data is representative, determine the FD

Kroenke, Database Processing

Functional Dependencies in the

SKU_DATA Table

SKU  (SKU_Description, Department, Buyer)

SKU_Description  (SKU, Department, Buyer)

Buyer  Department

Kroenke, Database Processing

Class Exercise

Assuming all FD that are verified by the data are indeed FD, determine the FD in the ORDER_ITEM Table

Kroenke, Database Processing

What Makes Determinant Values Unique?

 A determinant is unique in a relation if, and only if, it determines every other column in the relation

 Unique determinants = superkey

Kroenke, Database Processing

Key

 A set of columns is a key for a relation if :

1. No two distinct rows can have same values in all key columns, and

2. This is not true for any subset of the key

 Part 2 false? A superkey

 Primary key

 Alternate key

Kroenke, Database Processing

Normal Forms

 Relations are categorized as a normal form based on which modification anomalies or other problems that they are subject to:

Kroenke, Database Processing

Normal Forms

 Boyce-Codd Normal Form (BCNF) – A relation is in BCNF if every determinant is a candidate key

“I swear to construct my tables so that all nonkey columns are dependent on the key, the whole key and nothing but the key, so help me Codd.”

Kroenke, Database Processing

Eliminating Modification Anomalies from

Functional Dependencies in Relations

 Put all relations into Boyce-Codd Normal Form

(BCNF):

Kroenke, Database Processing

Putting a Relation into BCNF:

EQUIPMENT_REPAIR

Kroenke, Database Processing

Putting a Relation into BCNF:

EQUIPMENT_REPAIR

EQUIPMENT_REPAIR (ItemNumber, Type, AcquisitionCost,

RepairNumber, RepairDate, RepairAmount)

ItemNumber  (Type, AcquisitionCost)

RepairNumber  (ItemNumber, Type, AcquisitionCost,

RepairDate, RepairAmount)

ITEM (ItemNumber, Type, AcquisitionCost)

REPAIR (ItemNumber, RepairNumber, RepairDate, RepairAmount)

Where REPAIR.ItemNumber must exist in

ITEM.ItemNumber

Kroenke, Database Processing

Putting a Relation into BCNF:

New Relations

Kroenke, Database Processing

Class Exercise

Put the following relation into BCNF

Kroenke, Database Processing

Multivalued Dependencies

 A multivalued dependency occurs when a determinant determines a particular set of values:

Employee  Degree

Employee  Sibling

PartKit  Part

 The determinant of a multivalued dependency can never be a primary key

Kroenke, Database Processing

Multivalued Dependencies

Kroenke, Database Processing

Eliminating Anomolies from

Multivaled Dependencies

 Multivalued dependencies are not a problem if they are in a separate relation, so:

 Always put multivalued dependencies into their own relation

 This is known as Fourth Normal Form (4NF)

Kroenke, Database Processing

Download