Problem #1 – Data Redundancy means Multiple Updates

Agenda
i. Data Anomalies (problems with un-normalized data)
ii. Writing a relation from a User View
iii. Writing a relation from a verbal or written description
iv. The first step of Normalization: Eliminating Repeating Groups
Definition:
Normalization is the process of assigning attributes to relations in such a way that data
redundancies are reduced or eliminated.
User Views can be individual descriptions, reports, forms, or lists of data that are
required to support the operations of a particular database user.
How do we Normalize?
We will Normalize our data records in three steps producing flexible and
powerful data structures that are free of redundancies.
a) 1st Normal Form: Eliminate Repeating Groups
b) 2nd Normal Form: Eliminate Partial Dependencies
c) 3rd Normal Form: Eliminate Transitive Dependencies
1. Data Anomalies (problems with un-normalized data)
Problem #1 – Data Redundancy means Multiple Updates
Problem #2 – Update Anomaly: Means possible Inconsistent Data
Problem #3 – Insertion Anomaly: No Place to Hold New Information
Problem #4 – Deletion Anomaly: Loss of Information that we wanted to keep
Problem #1 – Multiple Updates: The same update must be performed in several
locations of the database because the same data is repeated.
(Ex)
Student(Student-Num, Course, Student-Name, Teacher, Student-Age)

Student-Num   Course    Student-Name   Teacher    Student-Age
1243658712    History   Tom Blu        Ms.Green   12
2343216578    Java      Jill Fall      Mr.Brown   13
3214325436    History   Jack Pail      Ms.Green   12
If Ms.Green is replaced by Ms.White, we will have to make more than
one change to the database.
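The multiple-update problem can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the column names are written without hyphens (StudentNum, StudentName, StudentAge) and Student-Num is treated as the primary key, neither of which is stated in the notes.

```python
import sqlite3

# In-memory database holding the un-normalized Student relation from the example.
# Column names without hyphens are an assumption made for valid SQL identifiers.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE Student (
           StudentNum  TEXT PRIMARY KEY,
           Course      TEXT,
           StudentName TEXT,
           Teacher     TEXT,
           StudentAge  INTEGER
       )"""
)
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?, ?)",
    [
        ("1243658712", "History", "Tom Blu", "Ms.Green", 12),
        ("2343216578", "Java", "Jill Fall", "Mr.Brown", 13),
        ("3214325436", "History", "Jack Pail", "Ms.Green", 12),
    ],
)

# Replacing one teacher touches every row that repeats her name.
cur = conn.execute(
    "UPDATE Student SET Teacher = ? WHERE Teacher = ?",
    ("Ms.White", "Ms.Green"),
)
rows_changed = cur.rowcount
print(rows_changed)  # 2
```

One real-world change (a single teacher being replaced) required two row changes here, and the count grows with every student she teaches.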
Problem #2 – Inconsistent Data: When the same data is repeated in several
records, the copies can become inconsistent. In the example below, which
spelling is correct: Ms.Green or Ms.Greene?
(Ex)
Student(Student-Num, Course, Student-Name, Teacher, Student-Age)

Student-Num   Course    Student-Name   Teacher     Student-Age
1243658712    History   Tom Blu        Ms.Greene   12
2343216578    Java      Jill Fall      Mr.Brown    13
3214325436    History   Jack Pail      Ms.Green    12
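To see why inconsistent copies matter in practice, here is a minimal sqlite3 sketch using the rows of this example. The hyphen-free column names and the choice of Student-Num as primary key are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE Student (
           StudentNum  TEXT PRIMARY KEY,
           Course      TEXT,
           StudentName TEXT,
           Teacher     TEXT,
           StudentAge  INTEGER
       )"""
)
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?, ?)",
    [
        ("1243658712", "History", "Tom Blu", "Ms.Greene", 12),
        ("2343216578", "Java", "Jill Fall", "Mr.Brown", 13),
        ("3214325436", "History", "Jack Pail", "Ms.Green", 12),
    ],
)

# The database now holds two spellings and cannot tell us which one is right.
teachers = [
    row[0]
    for row in conn.execute("SELECT DISTINCT Teacher FROM Student ORDER BY Teacher")
]
print(teachers)  # ['Mr.Brown', 'Ms.Green', 'Ms.Greene']

# A lookup by one spelling silently misses the row stored under the other.
found = [
    row[0]
    for row in conn.execute(
        "SELECT StudentName FROM Student WHERE Teacher = ?", ("Ms.Green",)
    )
]
print(found)  # ['Jack Pail']
```

Tom Blu is missing from the result even though he is in the same History class, because his row carries the other spelling.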
Problem #3 – No Place to Hold New Information: Suppose we have just hired a new
teacher, Mr.Vert. We have no way to put him into the database because he has no
students yet.
(Ex)
Student(Student-Num, Student-Name, Teacher, Student-Age)

Student-Num   Student-Name   Teacher     Student-Age
1243658712    Tom Blu        Ms.Greene   14
2343216578    Jill Fall      Mr.Brown    14
3214325436    Jack Pail      Ms.Green    14
                             (No place for Mr.Vert)
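The insertion anomaly can be reproduced in a few lines of sqlite3. The sketch assumes that Student-Num is the primary key and must not be empty; the notes imply this but do not state it, and the hyphen-free column names are likewise an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: Student-Num is the primary key, so it may never be NULL.
conn.execute(
    """CREATE TABLE Student (
           StudentNum  TEXT NOT NULL PRIMARY KEY,
           StudentName TEXT,
           Teacher     TEXT,
           StudentAge  INTEGER
       )"""
)

# Mr.Vert has no students, so there is no Student-Num to store him under.
try:
    conn.execute(
        "INSERT INTO Student (StudentNum, StudentName, Teacher, StudentAge) "
        "VALUES (NULL, NULL, 'Mr.Vert', NULL)"
    )
    inserted = True
except sqlite3.IntegrityError:
    # The key constraint rejects a row that describes a teacher only.
    inserted = False

print(inserted)  # False
```

The only ways around the rejection would be to invent a fake student or to drop the key constraint, and both damage the data.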
Problem #4 – Loss of Information: If these students go to high school and we
remove the student records, then we will lose the information about the teachers
as well.
(Ex)
Student(Student-Num, Student-Name, Teacher, Student-Age)

Student-Num   Student-Name   Teacher     Student-Age
1243658712    Tom Blu        Ms.Greene   14
2343216578    Jill Fall      Mr.Brown    14
3214325436    Jack Pail      Ms.Green    14
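A short sqlite3 sketch of the deletion anomaly, using the rows of this example as given (including both spellings of the teacher's name). Column names without hyphens and the primary key choice are assumptions for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE Student (
           StudentNum  TEXT PRIMARY KEY,
           StudentName TEXT,
           Teacher     TEXT,
           StudentAge  INTEGER
       )"""
)
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?)",
    [
        ("1243658712", "Tom Blu", "Ms.Greene", 14),
        ("2343216578", "Jill Fall", "Mr.Brown", 14),
        ("3214325436", "Jack Pail", "Ms.Green", 14),
    ],
)

def teachers_on_record(connection):
    """Return every teacher name the database still knows about."""
    query = "SELECT DISTINCT Teacher FROM Student ORDER BY Teacher"
    return [row[0] for row in connection.execute(query)]

teachers_before = teachers_on_record(conn)

# The students move on to high school, so their records are removed.
conn.execute("DELETE FROM Student")

teachers_after = teachers_on_record(conn)
print(teachers_before)  # ['Mr.Brown', 'Ms.Green', 'Ms.Greene']
print(teachers_after)   # []
```

Nothing in the DELETE mentioned teachers, yet every teacher disappeared, because teacher data existed only inside student rows.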
2. Writing a relation from a User View
CLASS LISTS FOR 2004-1

Course/Sec   TeachID   Teacher       StudentID   StudentName
DBS201I      1199      Don Frey      061234978   Ju-jin Lee
                                     045342973   Pui-Ling Chan
                                     044511982   Cheryl Anderson
                                     075435973   Buu Tu
                                     …           …
OOP244Q      1204      Mort Moreau   067452397   Julie Rivieres
                                     …           …
a) List attributes
b) Show repeating groups
c) Select primary key (unique identifier for a row)
d) Give the table a name
a) List attributes
Course, Section, TeachID, Teacher, StudentID, StudentName
b) Show repeating groups
Course, Section, TeachID, Teacher, (StudentID, StudentName)
c) Select primary key (unique identifier for a row)
Course, Section, TeachID, Teacher, (StudentID, StudentName)
Primary key: Course, Section
d) Give the table a name
CLASSLIST(Course, Section, TeachID, Teacher, (StudentID, StudentName))
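One way to picture a relation with a repeating group is as a record that contains a list. The sketch below is an illustration only: the class names, field names, and the `dbdl` helper are hypothetical, and splitting "DBS201I" into course "DBS201" and section "I" follows the attribute list above.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Student:
    student_id: str
    student_name: str

@dataclass
class ClassList:
    course: str
    section: str
    teach_id: int
    teacher: str
    # The repeating group: many (StudentID, StudentName) pairs per class.
    students: List[Student] = field(default_factory=list)

dbs201 = ClassList(
    "DBS201", "I", 1199, "Don Frey",
    [
        Student("061234978", "Ju-jin Lee"),
        Student("045342973", "Pui-Ling Chan"),
        Student("044511982", "Cheryl Anderson"),
        Student("075435973", "Buu Tu"),
    ],
)

def dbdl(name: str, attributes: List[str], repeating_group: List[str]) -> str:
    """Render a relation in the notation used above, with the group in parentheses."""
    parts = attributes + ["(" + ", ".join(repeating_group) + ")"]
    return name + "(" + ", ".join(parts) + ")"

relation = dbdl(
    "CLASSLIST",
    ["Course", "Section", "TeachID", "Teacher"],
    ["StudentID", "StudentName"],
)
print(relation)
# CLASSLIST(Course, Section, TeachID, Teacher, (StudentID, StudentName))
```

The nested list is exactly what a flat table cannot store in a single row, which is why the repeating group has to be dealt with during normalization.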
3. Writing a relation from a verbal or written description
Write the DBDL for the following description:
Each dentist’s office has a unique identifier for insurance companies. There is a
mailing address for the office as well as the name of the head dentist. There are
many patients and each patient has a unique identifier number.
a) List attributes
OfficeNo, MailAddress, HeadDentist, PatientNo, PatientName
b) Show repeating groups
OfficeNo, MailAddress, HeadDentist, (PatientNo, PatientName)
c) Select primary key (unique identifier for a row)
OfficeNo, MailAddress, HeadDentist, (PatientNo, PatientName)
Primary key: OfficeNo
d) Give the table a name.
DENTISTOFFICE(OfficeNo, MailAddress, HeadDentist, (PatientNo, PatientName))
We call this 0NF or UNF (Unnormalized Form) because there are repeating groups.
4. The first step of Normalization: Eliminating Repeating Groups.
1st Normal Form: How to eliminate repeating groups.
Normalize the 0NF relations to 1NF by:
1) Selecting the primary key for the repeating group.
2) Removing the repeating group from the relation.
3) Making the primary key of the new relation the PK of the outside table plus
the key of the inside table (the repeating group).
4) Keeping the original relation (without the repeating group).
5) Writing the two relations.
(Ex)
DBS201J   1199   Don Frey   061234978   Ju-jin Lee
                            045342973   Pui-Ling Chan
                            044511982   Cheryl Anderson
                            075435973   Buu Tu
                            etc...
Our class would have as a record layout:
Class(Course Code, Section, TeacherID, TName, (Student ID, SName))
Step 1:
(Student ID, SName)
Step 2,3:
(Course Code, Section, Student ID, SName)
Step 4:
(Course Code, Section, TeacherID, TName)
Step 5:
CLASSLIST(Course Code, Section, Student ID, SName)
COURSE(Course Code, Section, TeacherID, TName)
So we get two tables after Normalizing to 1NF (First Normal Form).
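The five steps can be sketched as a small Python function that splits a record containing a repeating group into two flat relations. The function name `to_1nf`, the dictionary field names, and the "Students" key are hypothetical choices for this illustration.

```python
def to_1nf(records, outer_key, group_field):
    """Split records holding one repeating group into two flat relations."""
    outer_rows, inner_rows = [], []
    for rec in records:
        # Step 4: the original relation remains, without the repeating group.
        outer_rows.append({k: v for k, v in rec.items() if k != group_field})
        # Steps 2-3: each group member moves to a new relation whose key is
        # the outside table's PK plus the key of the repeating group.
        for member in rec[group_field]:
            row = {k: rec[k] for k in outer_key}
            row.update(member)
            inner_rows.append(row)
    return outer_rows, inner_rows

unnormalized = [
    {
        "CourseCode": "DBS201",
        "Section": "J",
        "TeacherID": 1199,
        "TName": "Don Frey",
        "Students": [
            {"StudentID": "061234978", "SName": "Ju-jin Lee"},
            {"StudentID": "045342973", "SName": "Pui-Ling Chan"},
            {"StudentID": "044511982", "SName": "Cheryl Anderson"},
            {"StudentID": "075435973", "SName": "Buu Tu"},
        ],
    }
]

# Step 5: write the two relations.
course_rows, classlist_rows = to_1nf(
    unnormalized, outer_key=("CourseCode", "Section"), group_field="Students"
)
print(len(course_rows), len(classlist_rows))  # 1 4
```

Each CLASSLIST row now carries Course Code and Section alongside the student fields, so it can be matched back to its COURSE row without any nesting.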
Selecting the Best Primary Key:
Which is the best Primary key from the fields of the repeating group?
_____StudentID________
CLASSLIST Table
DBS201 J 061234978 Ju-jin Lee
DBS201 J 045342973 Pui-Ling Chan
DBS201 J 044511982 Cheryl Anderson
DBS201 J 075435973 Buu Tu
COURSE Table
DBS201 J 1199 Don Frey
DBS201 K 1201 Patricia Belvedere
What is the key of record 3 in the CLASSLIST table? ______________________
What is the key of record 2 in the COURSE table? _________________________
(Exercise) Convert the following un-normalized records to 1st Normal Form.
Purchases at Shoppers Drug Mart-1111 Young Street Toronto are identified by a
unique purchase # on the bill. There can be several items and the purchase must
record the item #, the quantity, the unit price, a tax code for each item, and the total
price.