Lecture 1

advertisement
Lecture 4
• Normalization
• An Normalization
example
INSS 651
1
Normalization
A process for evaluating and correcting table
structure
Minimize data redundancy
Eliminate Anomalies
INSS 651
2
Is Normalization Necessary?
NO
But it is helpful to maintain data integrity and
consistency
INSS 651
3
Denormalization
• Reversing normalization
• i.e from 3rd NF to 2nd NF
• Or 2nd to 1st NF
INSS 651
4
Anomalies
• Update—requires update in multiple
locations
• Deletion—A deletion may lose important
information
• Insertion—Requires complete definitions,
ie does see page 187 (an employee can
not be entered unless he is assigned a
project
INSS 651
5
Normalization Process
• 1st NF
• 2nd NF
• 3rd NF
• Almost for 90-98% application 3rd NF is
sufficient
INSS 651
6
Dependency
When an attribute value depends on
attribute B then B is dependent on A
A---B
or values of B can be determined by value of
A, reverse may or may not be true
Ex:
ssn--Name
SSN, CID--Grade
INSS 651
7
Un-normalized relation
Remove REPEATING groups
1st NF
Remove PARTIAL dependency
2nd NF
Remove TRANSIENT dependency
3rd NF
Every determinant is a candidate key
Boyce-CODD NF
If we can convert a relation into 3NF almost 90-98% of anomalies are removed
INSS 651
8
Un-normalized form
A relation is in un-normalized form, if it contains repeating group
Typically shown in parentheses
Ex: PART NO DESC.
VENDOR-NAME ADDRESS UNIT-COST
1234
LOGIC
chip
INTEL
LSI LOGIC
5678
MEMORY INTEL
chip
SAN JOSE 150.00
SAN JOSE 120.00
SAN JOSE
50.00
SUPPLIER
(Part_no, Part_DESC, (Vendor_name, Vendor_address, Unit_cost))
INSS 651
9
Another Way
(Part_NO, V_NAME)-> Unit_cost
Part_NO->P_Desc (Partial dependency)
V-Name->V_DESC (Partial Dependency)
Part_NO
P-DESC
V_NAME
V_ADDRESS
INSS 651
UNIT_COST
10
1st NF
A relation is in 1st NF if it does NOT contain any repeating groups
(Part_no, Part_DESC, (Vendor_name, Vendor_address, Unit_cost))
1st NF..remove repeating groups
Break it into TWO relations
One without repeating group and
ONE with repeating group AND PK of other relation
S1 (Part_no, Part_DESC)
S2 (Vendor_name, Part_no, Vendor_address, Unit_cost
INSS 651
11
2nd NF
A relation is in 2nf NF if it is in 1stNF and it does not
contain any partial dependency
Partial dependency: A partial dependency exists if an
attribute is dependent ONLY on PART of the PK and the
WHOLE PK
We must examine each relation for partial dependency
NOTE: A partial dependency can only exist if there are
more than ONE attribute as PK
INSS 651
12
S1 (Part_no, Part_DESC)
S2 (Vendor_name, Part_no, Vendor_address,
Unit_cost
Note S1 is already in 2nd NF since there is only
attribute as PK
In S2:
Question is Vendor_address dependent on BOTH
vendor_name AND Part_NO?
Question is Unit_price dependent on BOTH
vendor_name AND Part_NO?
INSS 651
13
Question is Vendor_address dependent on BOTH
vendor_name AND Part_NO?
Answer: NO
Give me vendor_no and I can find
vendor_address, we do NOT need Part_No to
know vendor_address, ie Vendor_address
depends ONLY Vendor_name, hence the partial
dependency
INSS 651
14
Question is Unit_price dependent on BOTH
vendor_name AND Part_NO?
YES if you examine the table, price changes
with vendor and part_no, ie price depnds
on both Part_no AND which vendor
supplies it
INSS 651
15
Remove Partial Dependency
VENDOR _ADDRESS
VENDOR_name
UNIT_PRICE
PART#
Create TWO tables:
One with Partial dependency and other without it
S21 (Vendor_name, vendor_address)
S22(Vendor_name, Part_no, Unit_price)
INSS 651
16
3rd NF
A relation is in 3rd NF if it is in 2nd NF and it
does not contain any transitive
dependency
Transitive dependency: A transitive
dependency exists when some of the nonkey attributes are dependent on other nonkey attributes
INSS 651
17
So far we have three relations that are in
at least 2nd NF
S1 (Part_no, Part_DESC)
S21 (Vendor_name, vendor_address)
S22(Vendor_name, Part_no, Unit_price)
S1, S21 & S22 are also in 3rd NF since there
is ONLY ONE non_key attribute and
transitive dependency can NOT exist
INSS 651
18
ERD
PART
PART-SUPPLIED
INSS 651
VENDOR
19
Q6/p 175
A Using notation from the book
a. (C1, C3)- C2,C4,C5
(i.e., C2, C4, C5) are functionally dependent on C1 and C3
Above relation is in at least 1stNF, since there are No
repeating groups
C1C2 there is PARTIAL dependency since C2 depends
on PART of the PK and the whole PK
C4--C5 (transitive dependency since C5 ( a non-PK
attribute) depends on another non-PK attribute (C4)
INSS 651
20
Part b
Table 1
C1
Primary key: C1
Foreign key: None
Normal form: 3NF
C2
Table 2
C1
C3
C4
C5
Primary key: C1 + C3
Foreign key: C1 (to Table 1)
Normal form: 2NF, because the
table exhibits the transitive
dependencies C4
C5
INSS 651
21
Part c
C1
C1
C3
C4
C2
Table 1
Primary key: C1
Foreign key: None
Normal form: 3NF
C4
Table 2
Primary key: C1 + C3
Foreign key: C1 (to Table 1)
C4 (to Table 3)
Normal form: 3NF
C5
Table 3
Primary key: C4
Foreign key: None
Normal form: 3NF
INSS 651
22
Q8/P177
Table P5.8 Sample ITEM Records
Sample Value
Sample Value
Sample Value
ITEM_ID
231134-678
342245-225
254668-449
ITEM_LABEL
HP DeskJet 895Cse
HP Toner
DT Scanner
ROOM_NUMBER
325
325
123
BLDG_CODE
NTC
NTC
CSF
BLDG_NAME
Nottooclear
Nottoclear
Canseefar
BLDG_MANAGER
I. B. Rightonit
I. B. Rightonit
May B. Next
Attribute Name
INSS 651
23
Problem 8 Solution
ITEM_ID ITEM_DESCRIPTION BLDG_ROOM BLDG_CODE BLDG_NAME BLDG_MANAGER
INSS 651
24
ITEM_ID ITEM_DESCRIPTION BLDG_ROOM BLDG_CODE BLDG_NAME BLDG_MANAGER
Problem 9 Solution: All tables in 3NF
ITEM_ID ITEM_DESCRIPTION BLDG_ROOM BLDG_CODE
BLDG_CODE BLDG_NAME
EMP_CODE
EMP_CODE
EMP_LNAME EMP_FNAME
INSS 651
EMP_INITIAL
25
1
EMPLOYEE
(0,N)
manages
M
(1,1)
M
1
BUILDING
contains
(0,N)
EMPLOYEE
EMP_CODE
EMP_INITIAL
(1,1)
BUILDING
1
BLDG_CODE
EMP_LNAME
EMP_FNAME
ITEM
M
ITEM
1
ITEM_ID
BLDG_NAME
ITEM_DESCRIPTION
EMP_CODE
ITEM_ROOM
M
BLDG_CODE
INSS 651
26
INSS 651
27
Download