Database Management Systems & Programming
LIS 558 - Week 5
ER Model Transformation
Normalization
Faculty of Information & Media Studies
Summer 2000
Class Outline

E-R Transformation
E-R Transformation Exercises

Break



Normalization
Normalization Exercises
Steps to E-R Model Transformation
1. Identify entities
2. Identify relationships
3. Determine relationship type
4. Determine level of participation
5. Assign an identifier for each entity
6. Draw completed E-R diagram
7. Deduce a set of preliminary skeleton tables
along with a proposed primary key for each table
(using cases provided)
8. Develop a list of all attributes of interest (not
already listed) and systematically assign each to a
table in such a way as to achieve a 3NF design (i.e., no
repeating groups, no partial dependencies, and no
transitive dependencies)
Transforming an E-R Model

General Rules Governing Relationships
among Tables
1. All primary keys must be defined as NOT NULL.
2. Define all foreign keys to conform to the following
requirements for binary relationships.
– 1:M Relationship
– M:N Relationship
– 1:1 Relationship
– Weak Entity
Transforming an E-R Model

1:M Relationships
• Create the foreign key by putting the primary key of
the “one” (parent) in the table of the “many”
(dependent).
• Foreign Key Rules:

Participation                       Null          On Delete             On Update
Both sides MANDATORY                NOT NULL      RESTRICT              CASCADE
Both sides OPTIONAL                 NULL ALLOWED  SET NULL              CASCADE
One side OPTIONAL, other MANDATORY  NULL ALLOWED  SET NULL or RESTRICT  CASCADE
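The both-sides-MANDATORY rule (NOT NULL, ON DELETE RESTRICT, ON UPDATE CASCADE) can be sketched as follows. This is only an illustration, assuming Python's built-in sqlite3 module as the DBMS (the slides do not name one) and borrowing the EMPLOYEE and PRODUCT tables from Case 4. The lowercase table and column names and the helper name build_case4_schema are hypothetical.

```python
import sqlite3


def build_case4_schema(conn):
    """Create EMPLOYEE (the "one" side) and PRODUCT (the "many" side)."""
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE employee (
            emp_id   INTEGER PRIMARY KEY NOT NULL,
            emp_dept TEXT
        );
        -- The parent's primary key is placed in the dependent table.
        CREATE TABLE product (
            prod_id   INTEGER PRIMARY KEY NOT NULL,
            prod_name TEXT,
            emp_id    INTEGER NOT NULL
                      REFERENCES employee(emp_id)
                      ON DELETE RESTRICT
                      ON UPDATE CASCADE
        );
    """)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_case4_schema(conn)
    conn.execute("INSERT INTO employee VALUES (1, 'QA')")
    conn.execute("INSERT INTO product VALUES (100, 'Shirt', 1)")
    try:
        conn.execute("DELETE FROM employee WHERE emp_id = 1")
    except sqlite3.IntegrityError as exc:
        print("delete refused:", exc)
```

With RESTRICT, the parent row cannot be deleted while a dependent row still refers to it. With CASCADE on update, changing the parent's key value is carried through to the foreign key.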
Transforming an E-R Model

Weak Entity
• Put the key of the parent table (strong entity) in the
weak entity.
• The weak entity relationship conforms to the same
rules as the 1:M relationship, except foreign key
restrictions:
NOT NULL
ON DELETE CASCADE
ON UPDATE CASCADE

M:N Relationship
• Convert the M:N relationship to a composite (bridge)
entity consisting of (at least) the parent tables’
primary keys.
Transforming an E-R Model

1:1 Relationships
• If both entities are in mandatory participation in the
relationship and they do not participate in other
relationships, it is most likely that the two entities
should be part of the same entity.
Transforming an E-R Model

Case 1: M:N, Both sides MANDATORY
Transforming an E-R Model

Case 2: M:N, Both sides OPTIONAL
Transforming an E-R Model

Case 3: M:N, One side OPTIONAL
Transforming an E-R Model

Cases 1-3: M:N
[E-R diagram: PATIENT (1) to (M) prescribed (N) to (1) DRUG]
PATIENT (PATIENT_ID, PATIENT_LNAME, PATIENT_PHYSICIAN,...)
DRUG (DRUG_ID, DRUG_NAME, DRUG_MANUFACTURER, ...)
PRESCRIBE(PATIENT_ID, DRUG_ID, DOSAGE, DATE…)
NOTE: The relationship may have its own attributes.
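The three tables above can be sketched with Python's built-in sqlite3 module (an assumption; the slides do not prescribe a DBMS). The bridge table takes both parents' primary keys as its composite primary key and carries the relationship's own DOSAGE attribute. The function names, lowercase identifiers and sample drug names are illustrative only.

```python
import sqlite3


def build_prescribe_schema(conn):
    """Create PATIENT, DRUG and the PRESCRIBE bridge table."""
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE patient (
            patient_id    INTEGER PRIMARY KEY NOT NULL,
            patient_lname TEXT
        );
        CREATE TABLE drug (
            drug_id   INTEGER PRIMARY KEY NOT NULL,
            drug_name TEXT
        );
        -- Bridge table: composite key built from both parents' keys,
        -- plus an attribute that belongs to the relationship itself.
        CREATE TABLE prescribe (
            patient_id INTEGER NOT NULL REFERENCES patient(patient_id),
            drug_id    INTEGER NOT NULL REFERENCES drug(drug_id),
            dosage     TEXT,
            PRIMARY KEY (patient_id, drug_id)
        );
    """)


def drugs_for(conn, patient_id):
    """Return the names of all drugs prescribed to one patient."""
    rows = conn.execute(
        """SELECT d.drug_name
           FROM prescribe p JOIN drug d ON d.drug_id = p.drug_id
           WHERE p.patient_id = ?
           ORDER BY d.drug_name""",
        (patient_id,),
    )
    return [name for (name,) in rows]
```

Each side of the M:N relationship becomes an ordinary 1:M relationship to the bridge table, so a patient with several drugs simply has several PRESCRIBE rows.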
Example of decomposing entities
with a binary M:N relationship
Students:Classes have an M:N relationship,
therefore, decompose to three tables.
bridge table
Transforming an E-R Model

Case 4: 1:M, Both sides MANDATORY
[E-R diagram: EMPLOYEE (1) checks (M) PRODUCT]
EMPLOYEE (EMP_ID, EMP_DEPT, …)
PRODUCT (PROD_ID, PROD_NAME, PROD_%FIBRE, EMP_ID... )
Transforming an E-R Model

Case 5: 1:M, Both sides OPTIONAL
[E-R diagram: PHYSIOTHERAPIST (1) has (M) CLIENTS]
PHYSIOTHERAPIST (PT_ID, PT_LNAME, ...)
CLIENT (CLIENT_ID, CLIENT_LNAME, CLIENT_OHIP#, …PT_ID)
Transforming an E-R Model

Case 6: 1:M, Many side OPTIONAL, one side MANDATORY
[E-R diagram: MACHINE (1) contains (M) PARTS]
MACHINE (MACH_ID, MACH_NAME, MACH_DEPT, ...)
PART (PART_ID, PART_NAME, PART_CATEGORY, …, MACH_ID)
Transforming an E-R Model

Case 7: 1:M, One side OPTIONAL, many side MANDATORY
[E-R diagram: BAND (1) accepts (M) MUSICIAN]
BAND (BAND_ID, BAND_NAME, MUSIC_TYPE...)
MUSICIAN (MUSICIAN_ID, MUSICIAN_INSTRUMENT, … BAND_ID)
Transforming an E-R Model

Case 8: 1:1, Both Sides MANDATORY
Transforming an E-R Model

Case 8: 1:1, Both Sides MANDATORY
[E-R diagram: PLUMBER (1) assigned (1) BUILDING]
PLUMBER (PLUMBER_ID, PLUMBER_LNAME,…BUILDING_ID)
BUILDING (BUILDING_ID, BUILDING_ADDRESS,...)
[E-R diagram: EMPLOYEE (1) has a (1) JOB-DESCRIPTION]
EMPLOYEE (EMP_NUM, EMP_LNAME,…, JOB_DESC)
Transforming an E-R Model

Case 9: 1:1, Both Sides OPTIONAL
[E-R diagram: EXERCISER (1) has (1) TRAINER]
EXERCISER (EXERCISER_ID, EXERCISER_LNAME, …TRAINER_ID)
TRAINER (TRAINER_ID, TRAINER_LNAME, ...)
Transforming an E-R Model

Case 10: 1:1, One Side OPTIONAL, One Side MANDATORY
[E-R diagram: EMPLOYEE (1) has (1) AUTO]
EMPLOYEE (EMP_ID, EMP_LNAME, EMP_PHONE,…)
AUTO (LIC_NUM, SERIAL_NUM, MAKE, MODEL, …, EMP_ID)
Transforming an E-R Model

Case 11: Weak Entity (Foreign key located in weak entity)
Case 11. Decomposing Weak Entities



When the relationship type of a binary relationship is 1:M
between an entity and its weak entity, two tables are required:
one for each entity, with the entity key from each entity serving
as the primary key for the corresponding table.
Additionally, the entity that has a dependency on the existence
of another entity has a primary key that is partially or totally
derived from the parent entity of the relationship.
Weak entities must be deleted when the strong entity is
deleted.
[E-R diagram: HOSPITAL (1) contains (M) UNIT]
HOSPITAL (HOSP_ID, HOSP_NAME, HOSP_ADDRESS, ...)
UNIT (HOSP_ID, UNIT_NAME, HEAD_NURSE, ...)
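The weak-entity rules (NOT NULL, ON DELETE CASCADE, ON UPDATE CASCADE) applied to HOSPITAL and UNIT might look like the sketch below. It assumes Python's built-in sqlite3 module and a composite key of (HOSP_ID, UNIT_NAME) for UNIT, which is how the skeleton table above reads. The helper name and lowercase identifiers are hypothetical.

```python
import sqlite3


def build_hospital_schema(conn):
    """Create HOSPITAL (strong entity) and UNIT (weak entity)."""
    conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce FKs
    conn.executescript("""
        CREATE TABLE hospital (
            hosp_id   INTEGER PRIMARY KEY NOT NULL,
            hosp_name TEXT
        );
        -- The weak entity borrows the parent's key as part of its own key.
        CREATE TABLE unit (
            hosp_id    INTEGER NOT NULL
                       REFERENCES hospital(hosp_id)
                       ON DELETE CASCADE
                       ON UPDATE CASCADE,
            unit_name  TEXT NOT NULL,
            head_nurse TEXT,
            PRIMARY KEY (hosp_id, unit_name)
        );
    """)
```

Deleting a HOSPITAL row then removes its UNIT rows automatically, which matches the rule that weak entities must be deleted when the strong entity is deleted.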
Transforming an E-R Model

Case 12: Multivalued Attributes
Decomposing an IS-A Relationship
[Diagram: CLIENT supertype with subtypes INDIVIDUAL CLIENT and CORPORATE CLIENT]
Entity CLIENT contains
ClientNumber
ClientName
Address
AmountDue
SocialInsuranceNumber
TaxIdentificationNumber
ContactPerson
Phone
Problem: Too many NULL values
Solution: Separate into CLIENT entity plus several
subtypes
Decomposing an IS-A Relationship




Create a table for the parent entity and for each of the child entities or
subtypes
Move the associated attributes from the parent entity into the child table
to which they correspond
From the parent entity take the entity key and add it as the primary key to
the corresponding table for each child entity
In the event a table corresponding to a child entity already has a primary
key then simply add the entity key from the parent entity as an attribute
of the table corresponding to the child entity
[Diagram: CLIENT supertype with subtypes INDIVIDUAL CLIENT and CORPORATE CLIENT]
CLIENT (CLIENT_ID, AMOUNT_DUE, …)
INDIVIDUAL_CLIENT (CLIENT_ID, SIN#, …)
CORPORATE_CLIENT(CLIENT_ID, GST#, …)
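A possible rendering of the three IS-A tables, assuming Python's built-in sqlite3 module. Each subtype table reuses CLIENT_ID as both its primary key and a foreign key to CLIENT. The ON DELETE CASCADE clause, the column names sin and gst, and the helper name are assumptions made for this sketch and are not stated in the slides.

```python
import sqlite3


def build_client_schema(conn):
    """Create the CLIENT supertype and its two subtype tables."""
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE client (
            client_id  INTEGER PRIMARY KEY NOT NULL,
            amount_due REAL
        );
        -- Subtype tables hold only the attributes specific to that subtype,
        -- so no row has to store NULLs for the other subtype's attributes.
        CREATE TABLE individual_client (
            client_id INTEGER PRIMARY KEY NOT NULL
                      REFERENCES client(client_id) ON DELETE CASCADE,
            sin       TEXT
        );
        CREATE TABLE corporate_client (
            client_id INTEGER PRIMARY KEY NOT NULL
                      REFERENCES client(client_id) ON DELETE CASCADE,
            gst       TEXT
        );
    """)
```

A subtype row cannot exist without its CLIENT row, because the shared key is enforced as a foreign key.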
Transforming Recursive Relationships
1:1 - create a foreign key field (duplicate values not allowed) that
contains the domain of primary key
Stud_ID  Stud_FName  Stud_LName  Locker Partner
1        Rodney      Jones       4
2        Joki        Singh       3
3        Francine    Moire       2
4        Anne        Abel        1
1:M - create a foreign key field (duplicate values allowed) that
contains the domain of primary key
Prod_ID  Prod_Name         Base_Prod
1        Chicken burger    2
2        Raw Chicken
3        Wiener Schnitzel  5
4        Fried Chicken     2
5        Ground pork
6        Pork dumplings    5
Transforming M:N Recursive Relationships
M:N - create a second relation that contains two foreign keys: one
for each side of the relationship “course requires course.”
Decomposing Ternary relationships



When a relationship is three-way (ternary) four
preliminary tables are required: one for each
entity, with the entity key from each entity
serving as the primary key for the corresponding
table, and one for the relationship.
The table corresponding to the relationship will
have among its attributes the entity keys from
each entity
Similarly, when a relationship is N-way, N+1
preliminary tables are required.
Transforming an E-R Diagram

Converting an E-R Model into a Database
Structure
• A painter might paint many paintings. The cardinality is
(1,N) in the relationship between PAINTER and
PAINTING.
• Each painting is painted by one (and only one) painter.
• A painting might (or might not) be exhibited in a
gallery; i.e., the GALLERY is optional to PAINTING.
Transforming an E-R Model

Transformed schema for ARTIST database
PAINTER(PTR_NUM, PTR_LASTNAME,
PTR_FIRSTNAME, PTR_INITIAL,
PTR_AREACODE, PTR_PHONE)
Case 4
PAINTING(PNTG_NUM, PNTG_TITLE,
PNTG_PRICE, PTR_NUM, GAL_NUM)
Case 7
GALLERY(GAL_NUM, GAL_OWNER,
GAL_AREACODE, GAL_PHONE, GAL_RATE)
A Data Dictionary for the ARTIST Database
Library Database Example
[E-R diagram: AUTHOR (M) writes (N) BOOK; PUBLISHER (1) publishes (M) BOOK]
PUBLISHER (Pub_ID, ___, ___, ___, ___, …)
BOOK (ISBN, Pub_ID, ___, ___, ___, ___, …)
AUTHOR (Author_ID, ___, ___, ___, ___, …)
WRITES(ISBN, Author_ID, ___, ___, ___, ___, …)
Case 6
Case 2
University Example
[E-R diagram: STUDENT (M) takes (N) COURSE; COURSE (M) taught by (N) FACULTY; FACULTY (1) to (M) STUDENT]
ENROLL (StudID, CourseID, ___, ...)       Case 2
STUDENT (StudID, ___, ___, FacID, …)      Case 6
COURSE (CourseID, ___, ___, ___, …)
FACULTY (FacID, ___, ___, ___, ___, …)
TEACH (FacID, CourseID,…)                 Case 2
E-R Modeling &
Transformation Exercise
E-R Modeling & Transformation Exercise
Create an E-R model and define its table structures for the
following requirements.
- An INVOICE is written by a SALESREP. Each sales
representative can write many invoices, but each invoice is
written by a single sales representative.
- The INVOICE is written for a single CUSTOMER. However,
each customer may have many invoices.
- An INVOICE may include many detail lines (LINE) which
describe the products bought by the customer.
- The product information is stored in a PRODUCT entity.
- The product's vendor information is found in a VENDOR
entity.
E-R Modeling & Transformation Exercise
[E-R diagram: entities CUSTOMER, INVOICE, SALESREP, INV_LINE, PRODUCT, SHIPMENT and VENDOR joined by 1:M relationships (labels recovered: generates, writes), each side annotated with (1,1) or (1,N) cardinalities]
E-R Modeling & Transformation Exercise
• Keep in mind that the preceding E-R diagram reflects a set of
business rules that may easily be modified
• For example, if customers are supplied via a commercial customer
list, many of the customers on that list will not (yet!) have bought
anything, so INVOICE would be optional to CUSTOMER
• We are assuming here that a product can be supplied by many
vendors and that each vendor can supply many products. The
PRODUCT may be optional to VENDOR if the vendor list includes
potential vendors from which you have not (yet) ordered anything.
• Some products may never sell, so LINE is optional to PRODUCT...
because an unsold product will never appear in an invoice line.
• LINE may be shown as weak to INVOICE, because it borrows the
invoice number as part of its primary key and it is existence-dependent on INVOICE
• The design depends on the exact nature of the business rules.
E-R Modeling & Transformation Exercise
[E-R diagram: entities VENDOR, ORDER, PRODUCT, INV_LINE, INVOICE, CUSTOMER and SALESREP joined by 1:M relationships (labels recovered: ships, generates, writes, shows in, contains, is in), each side annotated with (0,N), (1,1) or (1,N) cardinalities]
E-R Modeling & Transformation Exercise
CUSTOMER (CustomerID, …)
INVOICE (InvoiceID, CustomerID, SalesRepID,…)
LINE (InvoiceID, LineID, ProdID,…)
PRODUCT (ProductID, …)
SALESREP (SalesRepID, …)
VENDOR (VendorID,…)
ORDER (OrderID, ProductID, VendorID,…)
Further E-R Transformation
Exercises
ER Modeling I handout - Q1
DIVISION (DivisionID,…ManagerID)
DEPARTMENT (DeptID,…DivisionID)
not null
EMPLOYEE (EmpID, …DeptID)
PROJECT (ProjectID,…)
EMPLOYEE_PROJECT (EmpID, ProjectID,…)
null allowed
ER Modeling I - Q2
INSTRUCTOR (InstructorID, HighestDegree, …)
COURSE (CourseID, ClassTitle, …)
CLASS (ClassID, CourseID, InstructorID, Term…)
TRAINEE (TraineeID, …)
ENROLL (TraineeID, ClassID, Term…)*
All foreign keys not null.
* Optionally, create an EnrollmentID
attribute to use as primary key.
ER Modeling I - Q3
CUSTOMER (CustomerID, …)
INVOICE (InvoiceID, CustomerID, SalesRepID,…)
LINE (InvoiceID, LineID, ProdID,…)
PRODUCT (ProductID, …)
SALESREP (SalesRepID, …)
VENDOR (VendorID,…)
SHIP (ShipID, ProductID, VendorID,…)
All foreign keys not null
ER Modeling I - Q4
AGENT (AgentID, LName, Region…)
CLIENT (ClientID, LName,…)
MUSICIAN (MusicianID, AgentID, Name,
DaysAvailable,…)
EVENT (EventID, ClientID, MusicianID, Date,
Time, Location…)
INSTRUMENT (InstrumentID, …)
MUSICIAN_INSTRUMENT (MusicianID,
InstrumentID, YearsExperience…)
All foreign keys not null.
ER Modeling I - Q5
CITY (CityID, …)
TEAM (TeamID, CoachID, CityID, …)
PLAYER (PlayerID, TeamID,…)
COACH (CoachID, TeamID,…)
GAME (GameID, HomeTeamID, VisitorTeamID,…)
All foreign keys not null.
ER Modeling II - Q1
COMPANY (CompanyID, …)
DEPARTMENT (DepartmentID, CompanyID…)
EMPLOYEE (EmployeeID, DepartmentID, …)
DEPENDENT (EmployeeID, DependentID, …)
EMPLOYEE_HISTORY (EmployeeID, HistoryID, …)
All foreign keys are not null
ER Modeling II - Q2
MEMBER (MemberID, …)
WORKOUT (WorkoutID, MemberID, Date…)
EXERCISE (ExerciseID…)
WORKOUT_EXERCISE (WorkoutID, ExerciseID,
NumberSets, NumberReps,…)
ER Modeling II - Q3
EMPLOYEE (EmployeeID, Name…PositionID)
PART_TIME_EMPLOYEE (EmployeeID,
HourlyRate…)
FULL_TIME_EMPLOYEE (EmployeeID, Salary,
OfficeRoom, …)
POSITION (PositionID, Title, Job_Description…)
All foreign keys not null.
ER Modeling II - Q4
USER (UserID, Name, Department,…)
PROBLEM (ProblemID, TimeSpent, UserID,
ResolverID,…)
HARDWARE (ProblemID, Description, Solution…)
SOFTWARE (ProblemID, SoftwareVersion, …)
RESOLVER (ResolverID, Name, Phone, Level, …)
All foreign keys not null.
ER Modeling II - Q5
EMPLOYEES (EmployeeID, SupervisorID, …)
SKILLS (SkillID, SkillName, …)
EMPLOYEE_SKILL (EmployeeID, SkillID, DateAcquired,
Certification,…)
PROJECTS (ProjectID, ProjectName, ManagerID, StartDate…)
EMPLOYEE_PROJECT (EmployeeID, ProjectID, Role…)
PROJECT_SKILL (ProjectID, SkillID, SkillLevelRequired,
NumberStaff,…)*
DEPENDENTS (EmployeeID, DependentID, DateOfBirth…)
WORK_HISTORY(EmployeeID, HistoryID,…)
BENEFITS (BenefitID, BenefitType, Company, Contact,…)
EMPLOYEE_BENEFIT (EmployeeID, BenefitID,…)
All foreign keys are not null.
* Optionally, create a ProjectSkill_ID
attribute to use as primary key.
ER Modeling II - Q6
ORCHARD (OrchardID, Location, …)
SPECIES (SpeciesID, Name, OrchardID…)
DISEASE (DiseaseID, Symptoms, Treatment,…)
SPECIES_DISEASE (SpeciesDiseaseID, SpeciesID,
DiseaseID, Date,…)*
CUSTOMER (CustomerID, …)
ORDER (OrderID, CustomerID, …)
ORDERDETAILS (OrderID, DetailID, SpeciesID,…)
All foreign keys not null.
* Optionally, use the combination of
SpeciesID, DiseaseID and Date as primary
key and remove SpeciesDiseaseID entirely.
Class Outline

E-R Transformation
E-R Transformation Exercises

Break

Normalization
Normalization Exercises
Transformation & Normalization
1. Identify entities
2. Identify relationships
3. Determine relationship type
4. Determine level of participation
5. Assign an identifier for each entity
6. Draw completed E-R diagram
7. Deduce a set of preliminary skeleton tables along with
a proposed primary key for each table (using rules
provided)
8. Develop a list of all attributes of interest (not already
listed) and systematically assign each to a table in such
a way as to achieve a 3NF design (i.e., no repeating
groups, no partial dependencies, and no transitive
dependencies)
Database Design Problems




Database design is the process of separating
information into multiple tables that are related
to each other
Single table designs work only for the simplest
of situations in which data integrity problems
are easy to correct
Anomalies (abnormalities) often arise in single
table designs as a result of inserting, deleting,
or updating records
Some tables are better structured than others
(i.e., result in fewer anomalies)
Database Design Problems

Numerous anomalies can arise during the
design of databases
• Redundancy
• Multi-valued problems
• Update anomalies
• Insertion anomalies
• Deletion anomalies
The Problem with Nulls
1. Nulls used in mathematical expressions
- unknown quantity leads to unknown total value
- misleading value of all inventory

Product ID  Product Description               Category     Price     Quantity  Total Value
801         Shur-Lock U-Lock                  Accessories  75.00
802         SpeedRite Cyclecomputer                        60.00     20        1,200.00
803         SteelHead Microshell Helmet       Accessories  40.00     40        1,600.00
804         SureStop 133-MB Brakes            Components   25.00     10        250.00
805         Diablo ATM Mountain Bike          Bikes        1,200.00
806         Ultravision Helmet Mount Mirrors               7.45      10        74.50
                                                                     Total:    3,124.50

2. Nulls used in aggregate functions
- blanks exist under category
- cannot be counted because they don’t exist!

Category     Total Occurrences
(blank)      0
Accessories  2
Bikes        1
Components   1
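Both null problems can be reproduced with a short sketch. It assumes Python's built-in sqlite3 module and loads the six product rows from the slide. The table name, column names and function names are illustrative. In SQLite, an expression involving NULL yields NULL, SUM skips NULLs, and COUNT(column) counts only non-NULL values.

```python
import sqlite3

# Product rows as on the slide; None stands for a null (blank) entry.
PRODUCTS = [
    (801, "Shur-Lock U-Lock", "Accessories", 75.00, None),
    (802, "SpeedRite Cyclecomputer", None, 60.00, 20),
    (803, "SteelHead Microshell Helmet", "Accessories", 40.00, 40),
    (804, "SureStop 133-MB Brakes", "Components", 25.00, 10),
    (805, "Diablo ATM Mountain Bike", "Bikes", 1200.00, None),
    (806, "Ultravision Helmet Mount Mirrors", None, 7.45, 10),
]


def load_products():
    """Return an in-memory database holding the PRODUCT rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE product (product_id INTEGER PRIMARY KEY, "
        "description TEXT, category TEXT, price REAL, quantity INTEGER)"
    )
    conn.executemany("INSERT INTO product VALUES (?, ?, ?, ?, ?)", PRODUCTS)
    return conn


def inventory_total(conn):
    """Sum price * quantity; rows with a null quantity contribute nothing."""
    (total,) = conn.execute(
        "SELECT SUM(price * quantity) FROM product"
    ).fetchone()
    return total


def category_counts(conn):
    """Count non-null category values per group; the null group reports 0."""
    rows = conn.execute(
        "SELECT category, COUNT(category) FROM product GROUP BY category"
    )
    return dict(rows)
```

The total silently leaves out the U-Lock and the mountain bike, so the reported inventory value understates what is actually in stock, and the two products with no category are counted as zero occurrences.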
Database Design Problems



Use of the relational database model removes
some database anomalies
Further removal of database anomalies relies
on a structured technique called normalization
Presence of some of these anomalies is
sometimes justified in order to enhance
performance
Database design consists of balancing the art
of design with the science of design
Normalization



The goal in database design is to create well-structured
tables
Transform E-R models to tables following the
rules provided
Assuring that tables are well-structured with minimal
problems (redundancy, multi-valued attributes,
update anomalies, insertion anomalies, deletion
anomalies) is achieved using a structured technique
called normalization
Normalization


Normalization is the structured decomposition of
one table into two or more tables using a
procedure designed to determine the most
appropriate split
Normalization is our method of making sure the E-R
design was correct in the first place
Rules for Normalization

Basic #1 Rule
• The attribute values in a relational table
should be functionally dependent (FD) on the
primary key value.
• In any table, a field A is said to be
functionally dependent on field B if,
regardless of any insertions or deletions, the
value of B determines the value of A (in other
words only one value of A occurs with a
particular value of B)
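The definition can be checked mechanically against a set of rows: A is functionally dependent on B if no value of B ever appears with two different values of A. A minimal sketch in plain Python, where the function name and the dictionary-per-row representation are assumptions made for illustration:

```python
def is_functionally_dependent(rows, determinant, dependent):
    """Return True if every determinant value maps to a single dependent value.

    rows is a list of dictionaries, one per record; determinant and
    dependent are the field names playing the roles of B and A.
    """
    seen = {}
    for row in rows:
        key = row[determinant]
        value = row[dependent]
        # setdefault stores the first value seen for this key and returns it;
        # any later, different value breaks the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


if __name__ == "__main__":
    sample = [
        {"EmpID": 1, "Dept": "Oncology"},
        {"EmpID": 2, "Dept": "Renal"},
        {"EmpID": 1, "Dept": "Oncology"},
    ]
    print(is_functionally_dependent(sample, "EmpID", "Dept"))
```

A check like this can only show that the current data are consistent with a dependency. Whether the dependency really holds is a business rule, which is why the definition says "regardless of any insertions or deletions".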
Rules for Normalization

First Normal Form (1NF)
• A table cannot have repeating fields or
groups (i.e., must remove redundant data)
• Repeating groups are removed by creating
another table which holds those attributes
that repeat. This second table is then linked
to the original table with an identifier (i.e.,
foreign key)
Rules for Normalization

Second Normal Form (2NF)
• Table is in 1NF
• All nonkey fields in a table must be
functionally dependent on all of the key (i.e.,
remove all partial dependencies)
• 2NF is primarily concerned with
dependencies involving a concatenated
primary key (nonkey fields must be
functionally dependent on the entire
concatenated key not just one attribute of
the composite key)
Rules for Normalization

Third Normal Form (3NF)
• Table is in 2NF
• A nonkey field cannot be functionally
dependent on another nonkey field (i.e.,
remove transitive dependencies by placing
attributes involved in a new relational table)
Rules for Normalization






Boyce-Codd Normal Form (BCNF)
Fourth Normal Form (4NF)
Fifth Normal Form (5NF)
Domain-Key Normal Form (DKNF)
For most database designs 3NF is
sufficient
3NF is the level used for designs in this course
First Normal Form

A table is in first normal form if it meets the
following criteria: The data are stored in a two-dimensional
table with no two rows identical and
there are no repeating groups.
• The following table is NOT in first normal form
because it contains a multi-valued attribute (an
attribute with more than one value in each row).

Member_ID  Memb_FName  Memb_LName  Hobbies
1          Rodney      Jones       hiking, cooking
3          Francine    Moire       golf, theatre, hiking
2          Anne        Abel        concerts
Handling multi-valued attributes: Incorrect Solutions
Member_ID  Memb_FName  Memb_LName  Hobbies
1          Rodney      Jones       hiking, cooking
3          Francine    Moire       golf, theatre, hiking
2          Anne        Abel        concerts

Member_ID  Memb_FName  Memb_LName  Hobby1    Hobby2   Hobby3
1          Rodney      Jones       hiking    cooking
3          Francine    Moire       golf      theatre  hiking
2          Anne        Abel        concerts

Member_ID  Memb_FName  Memb_LName  Hobbies
1          Rodney      Jones       hiking
1          Rodney      Jones       cooking
3          Francine    Moire       golf
3          Francine    Moire       theatre
3          Francine    Moire       hiking
2          Anne        Abel        concerts
Handling multi-valued attributes: Correct Solution

Create another entity (table) to handle multiple instances of the
repeating group. This second table is then linked to the original
table with an identifier (i.e., foreign key). This solution has the
following advantages:
• no limit to the number of hobbies per member
• no waste of disk space
• searching becomes much easier within a column (e.g., who likes
hiking?)
Member_ID  Memb_FName  Memb_LName  Hobbies
1          Rodney      Jones       hiking, cooking
3          Francine    Moire       golf, theatre, hiking
2          Anne        Abel        concerts

Member_ID  Memb_FName  Memb_LName
1          Rodney      Jones
3          Francine    Moire
2          Anne        Abel

Member_ID  Hobby
1          hiking
1          cooking
3          golf
3          theatre
3          hiking
2          concerts
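The two-table solution and the "who likes hiking?" search can be tried out with the sketch below, which assumes Python's built-in sqlite3 module. The table name member_hobby and both function names are hypothetical.

```python
import sqlite3


def build_member_db():
    """Create MEMBER and the linked hobby table, loaded with the slide data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE member (
            member_id  INTEGER PRIMARY KEY NOT NULL,
            memb_fname TEXT,
            memb_lname TEXT
        );
        -- One row per (member, hobby) pair, linked back by a foreign key.
        CREATE TABLE member_hobby (
            member_id INTEGER NOT NULL REFERENCES member(member_id),
            hobby     TEXT NOT NULL,
            PRIMARY KEY (member_id, hobby)
        );
    """)
    conn.executemany(
        "INSERT INTO member VALUES (?, ?, ?)",
        [(1, "Rodney", "Jones"), (3, "Francine", "Moire"), (2, "Anne", "Abel")],
    )
    conn.executemany(
        "INSERT INTO member_hobby VALUES (?, ?)",
        [(1, "hiking"), (1, "cooking"), (3, "golf"),
         (3, "theatre"), (3, "hiking"), (2, "concerts")],
    )
    return conn


def members_who_like(conn, hobby):
    """Return first names of members with the given hobby, by Member_ID."""
    rows = conn.execute(
        """SELECT m.memb_fname
           FROM member m JOIN member_hobby h ON h.member_id = m.member_id
           WHERE h.hobby = ?
           ORDER BY m.member_id""",
        (hobby,),
    )
    return [name for (name,) in rows]
```

The search is a plain equality test on one column, with no need to scan Hobby1, Hobby2 and Hobby3 or to pick apart a comma-separated string.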
Handling Repeating Groups


An attribute can have a group of several data entries. Repeating
groups can be removed by creating another table which holds
those attributes that repeat. This second table (validation table)
is then linked to the original table with an identifier (i.e., foreign
key)
Advantages: fewer characters stored in tables; reduces miskeying and update
anomalies
Product_ID  Product_Name                      Category   Price
801         Shur-Lock U-Lock                  Accessory  75.00
802         SpeedRite Cyclecomputer           Component  60.00
803         SteelHead Microshell Helmet       Accessory  40.00
804         SureStop 133-MB Brakes            Component  25.00
805         Diablo ATM Mountain Bike          Bike       1,200.00
806         Ultravision Helmet Mount Mirrors  Accessory  7.45

Product_ID  Product_Name                      Category  Price
801         Shur-Lock U-Lock                  1         75.00
802         SpeedRite Cyclecomputer           2         60.00
803         SteelHead Microshell Helmet       1         40.00
804         SureStop 133-MB Brakes            2         25.00
805         Diablo ATM Mountain Bike          3         1200.00
806         Ultravision Helmet Mount Mirrors  1         7.45

Category_ID  Category
1            Accessory
2            Component
3            Bike
Second Normal Form

A table is in second normal form if it meets the following
criteria: The relation is in first normal form, and, all
nonkey attributes are functionally dependent on the
entire primary key (no partial dependencies).
• Applies only to tables that have a composite primary key.
• In the following table, both the EmpID and Training
(composite primary key) determine Date, whereas, only
EmpID (part of the primary key) determines Dept.
EmpID  Training  Date       Dept
1      Word      12-Sep-99  Oncology
3      Excel     14-Oct-99  Paediatrics
2      Excel     14-Oct-99  Renal
1      Access    23-Nov-99  Oncology
Removing Partial Dependencies

Remove partial dependencies by separating the relation into
two relations. Reduces the problems of:
• update anomalies
• delete anomalies
• insert anomalies
• redundancies

EmpID  Training  Date       Dept
1      Word      12-Sep-99  Oncology
3      Excel     14-Oct-99  Paediatrics
2      Excel     14-Oct-99  Renal
1      Access    23-Nov-99  Oncology

EmpID  Training  Date
1      Word      12-Sep-99
3      Excel     14-Oct-99
2      Excel     14-Oct-99
1      Access    23-Nov-99

EmpID  Dept
1      Oncology
2      Renal
3      Paediatrics
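The split can be expressed as a small plain-Python sketch: Dept depends only on EmpID, so it moves to its own relation keyed by EmpID, and joining the two relations back together reproduces the original rows. The function names decompose and rejoin and the dictionary representation are illustrative choices.

```python
TRAINING_RECORDS = [
    {"EmpID": 1, "Training": "Word", "Date": "12-Sep-99", "Dept": "Oncology"},
    {"EmpID": 3, "Training": "Excel", "Date": "14-Oct-99", "Dept": "Paediatrics"},
    {"EmpID": 2, "Training": "Excel", "Date": "14-Oct-99", "Dept": "Renal"},
    {"EmpID": 1, "Training": "Access", "Date": "23-Nov-99", "Dept": "Oncology"},
]


def decompose(rows):
    """Split rows into a training relation and an EmpID -> Dept relation."""
    training = [
        {"EmpID": r["EmpID"], "Training": r["Training"], "Date": r["Date"]}
        for r in rows
    ]
    employee = {}
    for r in rows:
        # Dept is stored once per employee instead of once per training row.
        employee[r["EmpID"]] = r["Dept"]
    return training, employee


def rejoin(training, employee):
    """Join the two relations on EmpID to rebuild the original rows."""
    return [dict(t, Dept=employee[t["EmpID"]]) for t in training]
```

Because the join gives back exactly the original rows, nothing is lost by the decomposition, while employee 1's department is now recorded in one place only.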
Third Normal Form

A table is in third normal form if it meets the following
criteria: The relation is in second normal form, and, a
nonkey field is not functionally dependent on another
nonkey field (no transitive dependencies).
• The following table is in second normal form but NOT in
third normal form because Member_Id (the primary key)
does not determine every attribute (does not determine
RegistrationFee). RegistrationFee is determined by Sport.
Member_ID  Memb_FName  Memb_LName  Sport     RegistrationFee
1          Rodney      Jones       Swimming  $100
3          Francine    Moire       Tennis    $200
2          Anne        Abel        Tennis    $200
4          Goro        Azuma       Skiing    $150

Member_ID → Memb_FName, Memb_LName, Sport; Sport → RegistrationFee
Removing non-key Transitive Dependencies

Remove transitive dependencies by placing attributes
involved in a new relational table. Reduces the problems of:
• update anomalies
• delete anomalies
• insert anomalies
• redundancies

MemberID  MembFName  MembLName  Sport     RegFee
1         Rodney     Jones      Swimming  $100
3         Francine   Moire      Tennis    $200
2         Anne       Abel       Tennis    $200
4         Goro       Azuma      Skiing    $150

MemberID  MembFName  MembLName  Sport
1         Rodney     Jones      1
3         Francine   Moire      2
2         Anne       Abel       2
4         Goro       Azuma      3

SportID  Sport     RegFee
1        Swimming  $100
2        Tennis    $200
3        Skiing    $150
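One benefit of the split is that a fee is stored once per sport, so changing it cannot leave two members of the same sport with different fees. A plain-Python sketch under that reading follows. The names MEMBERS, SPORTS and fee_for are illustrative, and the sketch assumes that the member who skis refers to SportID 3.

```python
MEMBERS = [
    {"MemberID": 1, "MembFName": "Rodney", "MembLName": "Jones", "SportID": 1},
    {"MemberID": 3, "MembFName": "Francine", "MembLName": "Moire", "SportID": 2},
    {"MemberID": 2, "MembFName": "Anne", "MembLName": "Abel", "SportID": 2},
    {"MemberID": 4, "MembFName": "Goro", "MembLName": "Azuma", "SportID": 3},
]

# SportID -> sport name and registration fee (in dollars).
SPORTS = {
    1: {"Sport": "Swimming", "RegFee": 100},
    2: {"Sport": "Tennis", "RegFee": 200},
    3: {"Sport": "Skiing", "RegFee": 150},
}


def fee_for(member_id, members=MEMBERS, sports=SPORTS):
    """Look up a member's registration fee through the sport they play."""
    for member in members:
        if member["MemberID"] == member_id:
            return sports[member["SportID"]]["RegFee"]
    raise KeyError(member_id)
```

Raising the tennis fee means editing a single SPORTS entry, after which fee_for returns the new amount for every tennis player.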
Normalization Example: Video Store
A video rental shop tracks all of their information in one
table. There are now 20,000 records in it. Is it possible
to achieve a more efficient design? (They charge
$10/movie/day.)
(The table is shown in two column blocks; rows appear in the same order in each block.)

Cust_Name       Cust_address       Cust_Phone  Rental_date  Return_date  TotalPrice  Paid?
Rodney Jones    23 Richmond St.    681-9854    15-Oct-99    17-Oct-99    $60.00      yes
Francine Moire  750-12 Kipps Lane  672-9999    4-Nov-99
Anne Abel       5 Sarnia Road      432-1120    3-Sep-99     4-Sep-99     $20.00      yes
Rodney Jones    23 Richmond St.    681-9854    22-Sep-99    26-Sep-99    $80.00      yes

Video_1                Video_2               Video_3              VideoType_1  VideoType_2  VideoType3
Gone with the Wind     Braveheart            Mississippi Burning  Classic      Adventure    Adventure
Manhattan                                                         Comedy
Manhattan              The African Queen                          Comedy       Classic
Never Say Never Again  Silence of the Lambs                       Adventure    Horror

VIDEO (Cust_name, Cust_address, Cust_phone, Rental_date, Video_1, Video_2,
Video_3, VideoType_1, VideoType_2, VideoType3, Return_date, Total_Price,
Paid?)
Is the Video store in 1NF?
No attributes should form repeating groups - remove them by creating
another table. There are repeating groups for videos and customers.

CUSTOMER (Cust_Num, Cust_Name, Cust_address, Cust_phone)
Cust_Num  Cust_Name       Cust_address       Cust_Phone
1         Rodney Jones    23 Richmond St.    681-9854
2         Francine Moire  750-12 Kipps Lane  672-9999
3         Anne Abel       5 Sarnia Road      432-1120

VIDEO (VideoNum, VideoName, VideoType)
VideoNum  VideoName              VideoType
1         Gone with the Wind     Classic
2         Manhattan              Comedy
3         Never Say Never Again  Adventure
4         Braveheart             Adventure
5         Mississippi Burning    Adventure
6         The African Queen      Classic
7         Silence of the Lambs   Horror

RENTAL (Cust_num, VideoNum, Rental_date, Return_date, TotalPrice, Paid?)
Cust_Num  VideoNum  Rental_date  Return_date  TotalPrice  Paid?
1         1,4,5     15-Oct-99    17-Oct-99    $60.00      yes
2         2         4-Nov-99
3         2,6       3-Sep-99     4-Sep-99     $20.00      yes
1         3,7       22-Sep-99    26-Sep-99    $80.00      yes
Video Store: 1NF
(cont’d)
Have not yet removed all repeating groups - video is a multivalued attribute - move to another table.

Cust_Num  VideoNum  Rental_date  Return_date  TotalPrice  Paid?
1         1,4,5     15-Oct-99    17-Oct-99    $60.00      yes
2         2         4-Nov-99
3         2,6       3-Sep-99     4-Sep-99     $20.00      yes
1         3,7       22-Sep-99    26-Sep-99    $80.00      yes

RENTAL (RentalNum, Cust_Num, Rental_date,
Return_Date, TotalPrice, Paid?)
RentalNum  Cust_Num  Rental_date  Return_date  TotalPrice  Paid?
1          1         15-Oct-99    17-Oct-99    $60.00      yes
2          2         4-Nov-99
3          3         3-Sep-99     4-Sep-99     $20.00      yes
4          1         22-Sep-99    26-Sep-99    $80.00      yes

RENTALDETAILS (RentalNum, VideoNum)
RentalNum  VideoNum
1          1
1          4
1          5
2          2
3          2
3          6
4          3
4          7
The Video Store is now in 1NF
CUSTOMER (Cust_Num, Cust_Name, Cust_address, Cust_phone)
Cust_Num  Cust_Name       Cust_address       Cust_Phone
1         Rodney Jones    23 Richmond St.    681-9854
2         Francine Moire  750-12 Kipps Lane  672-9999
3         Anne Abel       5 Sarnia Road      432-1120

VIDEO (VideoNum, VideoName, VideoType)
VideoNum  VideoName              VideoType
1         Gone with the Wind     Classic
2         Manhattan              Comedy
3         Never Say Never Again  Adventure
4         Braveheart             Adventure
5         Mississippi Burning    Adventure
6         The African Queen      Classic
7         Silence of the Lambs   Horror

RENTAL (RentalNum, Cust_Num, Rental_date, Return_Date,
TotalPrice, Paid?)
RentalNum  Cust_Num  Rental_date  Return_date  TotalPrice  Paid?
1          1         15-Oct-99    17-Oct-99    $60.00      yes
2          2         4-Nov-99
3          3         3-Sep-99     4-Sep-99     $20.00      yes
4          1         22-Sep-99    26-Sep-99    $80.00      yes

RENTALDETAILS (RentalNum, VideoNum)
RentalNum  VideoNum
1          1
1          4
1          5
2          2
3          2
3          6
4          3
4          7
Is the Video Store in 2NF?
The only table that has a composite primary key has no
other fields, therefore, yes.
CUSTOMER (Cust_Num, Cust_Name, Cust_address, Cust_phone)
VIDEO (VideoNum, VideoName, VideoType)
RENTAL (RentalNum, Cust_Num, Rental_date,
Return_Date, TotalPrice, Paid?)
RENTALDETAILS (RentalNum, VideoNum)
[Table data for all four tables is unchanged from the 1NF result.]
Is the Video Store in 3NF?
Does each attribute in each table depend upon the primary key?
[Tables CUSTOMER, VIDEO, RENTAL and RENTALDETAILS, with the same rows as in the 1NF and 2NF results.]
The Video Store is now in 3NF
Because, in each table every attribute depends on the primary
key and not on any other key.
CUSTOMER (Cust_Num, Cust_Name, Cust_address, Cust_phone)
Cust_Num  Cust_Name       Cust_address       Cust_Phone
1         Rodney Jones    23 Richmond St.    681-9854
2         Francine Moire  750-12 Kipps Lane  672-9999
3         Anne Abel       5 Sarnia Road      432-1120

VIDEO (VideoNum, VideoName, VideoType)
VideoNum  VideoName              VideoType
1         Gone with the Wind     Classic
2         Manhattan              Comedy
3         Never Say Never Again  Adventure
4         Braveheart             Adventure
5         Mississippi Burning    Adventure
6         The African Queen      Classic
7         Silence of the Lambs   Horror

RENTAL (RentalNum, Cust_Num, Rental_date)
RentalNum  Cust_Num  Rental_date
1          1         15-Oct-99
2          2         4-Nov-99
3          3         3-Sep-99
4          1         22-Sep-99

RENTALDETAILS (RentalNum, VideoNum, ReturnDate, Amt_Paid)
RentalNum  VideoNum  ReturnDate  Amt_Paid
1          1         16-Oct-99   $10
1          4         17-Oct-99   $20
1          5         16-Oct-99   $10
2          2         5-Nov-99    $10
3          2         4-Sep-99    0
3          6         6-Sep-99    0
4          3         24-Sep-99   $5
4          7         16-Sep-99   0
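In the final design TotalPrice is no longer stored. It is a derived value that can be computed from RENTAL and RENTALDETAILS whenever it is needed. The plain-Python sketch below assumes the charge is $10 per movie for each full day between the rental date and that movie's return date, and uses only the rows for rentals 1 to 3. The names DAILY_RATE, RENTALS, RENTAL_DETAILS and total_price are hypothetical.

```python
from datetime import date

DAILY_RATE = 10  # dollars per movie per day, as stated in the example

# RentalNum -> Rental_date (rentals 1 to 3 from the RENTAL table).
RENTALS = {
    1: date(1999, 10, 15),
    2: date(1999, 11, 4),
    3: date(1999, 9, 3),
}

# (RentalNum, VideoNum, ReturnDate) rows from RENTALDETAILS.
RENTAL_DETAILS = [
    (1, 1, date(1999, 10, 16)),
    (1, 4, date(1999, 10, 17)),
    (1, 5, date(1999, 10, 16)),
    (2, 2, date(1999, 11, 5)),
    (3, 2, date(1999, 9, 4)),
    (3, 6, date(1999, 9, 6)),
]


def total_price(rental_num):
    """Derive the total charge for one rental from its detail rows."""
    rented_on = RENTALS[rental_num]
    total = 0
    for detail_rental, _video_num, returned_on in RENTAL_DETAILS:
        if detail_rental == rental_num:
            days_out = (returned_on - rented_on).days
            total += days_out * DAILY_RATE
    return total
```

Keeping the return date on each RENTALDETAILS row lets videos from the same rental come back on different days, which the single Return_date column of the earlier design could not express.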
Normalization Example: ARTIST
Checking Transformed ER Model

Transformed schema for ARTIST database
PAINTER(PTR_NUM, PTR_LASTNAME,
PTR_FIRSTNAME, PTR_INITIAL,
PTR_AREACODE, PTR_PHONE)
PAINTING(PNTG_NUM, PNTG_TITLE,
PNTG_PRICE, PTR_NUM, GAL_NUM)
GALLERY(GAL_NUM, GAL_OWNER,
GAL_AREACODE, GAL_PHONE, GAL_RATE)
1NF? Yes
2NF? Yes
3NF? Yes
Checking Transformed ER Model
[E-R diagram: entities VENDOR, ORDER, PRODUCT, INV_LINE, INVOICE, CUSTOMER and SALESREP joined by 1:M relationships (labels recovered: ships, generates, writes, shows in, contains, is in), each side annotated with (0,N), (1,1) or (1,N) cardinalities]
Checking Transformed ER Model
CUSTOMER (CustomerID, …)
INVOICE (InvoiceID, CustomerID, SalesRepID,…)
LINE (InvoiceID, LineID, ProdID,…)
PRODUCT (ProductID, …)
SALESREP (SalesRepID, …)
VENDOR (VendorID,…)
ORDER (OrderID, ProductID, VendorID,…)
1NF? Yes
2NF? Yes
3NF? Depends on placement of attributes
Normalization Exercises
Normalization Exercises
To keep track of office furniture, computers, printers,
and so on, the FOUNDIT company uses the following table
structure:
Attribute name     Sample value
ITEM_ID            2311345-678
ITEM_DESCRIPTION   HP DeskJet 660C printer
BLDG_ROOM          325
BLDG_CODE          DEL
BLDG_NAME          Dawn's Early Light
BLDG_MANAGER       E. R. Rightonit
Given this information, draw the dependency diagram.
Make sure you label the transitive and/or partial
dependencies.
Normalization Exercises
[Dependency diagram over ITEM_ID, ITEM_DESCRIPTION, BLDG_ROOM, BLDG_CODE, BLDG_NAME and BLDG_MANAGER, with the transitive dependencies labeled]
Normalization Exercises
All tables in 3NF
ITEM (ITEM_ID, ITEM_DESCRIPTION, ITEM_ROOM, BLDG_CODE)
BUILDING (BLDG_CODE, BLDG_NAME, EMP_CODE)
EMPLOYEE (EMP_CODE, EMP_LNAME, EMP_FNAME, EMP_INITIAL)
Normalization Exercises
[E-R diagram: EMPLOYEE (1) manages (M) BUILDING, with cardinalities (0,N) and (1,1); BUILDING (1) contains (M) ITEM, with cardinalities (1,N) and (1,1)]
EMPLOYEE: EMP_CODE, EMP_LNAME, EMP_FNAME, EMP_INITIAL
BUILDING: BLDG_CODE, BLDG_NAME, EMP_CODE
ITEM: ITEM_ID, ITEM_DESCRIPTION, ITEM_ROOM, BLDG_CODE
Normalization Exercises
Conflicting Goals of Design

Database design must reconcile the following
requirements:
• Design elegance requires that the design must adhere to
design rules concerning nulls, derived attributes,
redundancies, relationship types, etc.
• Information requirements are dictated by the end users
• Operational (transaction) speed requirements are also
dictated by the end users

Clearly, an elegant database design that fails to
address end user information requirements or one
that forms the basis for an implementation whose
use progresses at a snail's pace has little practical
use.
Characteristics of Fields


Each field within a table must have a unique
name (avoid spaces and special characters).
Data within a field must be of the same data
type. The following are common data types:
• character (text or string)
• memo (large character field)
• integer (whole numbers for calculations)
• number (values with decimals for calculations)
• currency (formatted number)
• logical or Boolean (true/false; 0,-1; yes/no)
• date/time (use computer’s internal calendar/clock)
• graphic (picture)
Guidelines for Ideal Table Design








Each table should represent a single theme or subject or entity
or transaction
Tables should include primary keys that uniquely identify each
record of each table
Avoid the use of smart keys that attempt to embed meaning into
primary keys (keys should be meaningless)
A primary key should be a unique, random or sequential collection
of alphabetic, numeric or alphanumeric characters
The domain of primary keys should be large enough to
accommodate the identification of unique rows for the entire
potential universe of records
Use the suffix ID in constructing primary keys to ensure they are
readily identifiable
Tables should not contain any of the following: multipart fields,
multivalued fields, calculated or derived fields or unnecessary
duplicate fields
There should be a minimum amount of redundant data
Common Errors in Database Design

• Flat file database
• Too much data
• Compound fields
• Missing keys
• Bad keys
• Missing relationships
• Unnecessary relationships
• Incorrect relationships
• Duplicate field names
• Cryptic field or table names
• Referential integrity
• Database Security
• Missing or incorrect business rules
• Missing or incorrect constraints
Ashenfelter, J. P. (March 26, 1999). Common Database Mistakes. Found online at
<http://webreview.com/wr/pub/1999/03/26/feature/index3.html> (June 5, 2000).
The Well-Structured Database





E-R modeling is a top-down method of designing
Transforming an E-R model does not guarantee
the best design (e.g., E-R model could be way
off)
Best to transform the E-R model and then check
the design according to the rules of
normalization
Normalization is a bottom-up method of designing
a database
Use both approaches to develop a well-structured database
For Week 6





Assignment #2 due June 12
Structured Query Language
Discuss project assignment
Read Rob, Chapter 3.1-3.6 and Chapter 6
Work on Adamski Tutorial #5