- Aatif Kamal

Chapter 5:
Logical Database Design and the
Relational Model
Modern Database Management
8th Edition
Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden
By: Aatif Kamal
Dated: March 2008
Objectives








Definition of terms
List five properties of relations
State two properties of candidate keys
Define first, second, and third normal form
Describe problems from merging relations
Transform E-R and EER diagrams to relations
Create tables with entity and relational integrity constraints
Use normalization to convert anomalous tables to well-structured relations
INFORMAL DEFINITIONS

Relation is NOT the same as Relationship
RELATION: A table of values






A relation may be thought of as a set of rows.
A relation may alternatively be thought of as a set of columns.
Each row represents a fact that corresponds to a real-world entity or relationship.
Each row has a value of an item or set of items that uniquely identifies that row in the table.
Sometimes row-ids or sequential numbers are assigned to identify the rows in the table.
Each column is typically referred to by its column name, column header, or attribute name.
FORMAL DEFINITIONS
A Relation may be defined in multiple ways.
 The Schema of a Relation:



R(A1, A2, ..., An)
Relation schema R is defined over attributes A1, A2, ..., An.
For example:
CUSTOMER (Cust-id, Cust-name, Address, Phone#)
Here, CUSTOMER is a relation defined over the four attributes Cust-id, Cust-name, Address, and Phone#, each of which has a domain, or set of valid values. For example, the domain of Cust-id is the set of 6-digit numbers.
FORMAL DEFINITIONS



A tuple is an ordered set of values.
Each value is derived from an appropriate domain.
Each row in the CUSTOMER table may be referred to as a tuple in the table and would consist of four values.

<632895, "John Smith", "101 Main St. Atlanta, GA 30332", "(404) 894-2000">
is a tuple belonging to the CUSTOMER relation.

A relation may be regarded as a set of tuples (rows).
Columns in a table are also called attributes of the relation.
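The formal definitions above can be mirrored in a few lines of Python. This is only an illustrative sketch: the class name `Customer`, the field names, and the helper `in_cust_id_domain` are made up for this example, and the 6-digit domain check follows the Cust-id domain mentioned earlier.

```python
from typing import NamedTuple


class Customer(NamedTuple):
    """One tuple of the CUSTOMER relation; field order follows the schema."""
    cust_id: int
    cust_name: str
    address: str
    phone: str


def in_cust_id_domain(value: int) -> bool:
    """Domain check for Cust-id: exactly six digits."""
    return 100000 <= value <= 999999


john = Customer(632895, "John Smith",
                "101 Main St. Atlanta, GA 30332", "(404) 894-2000")

# A relation is a set of tuples, so adding an identical tuple changes nothing.
customer_relation = {john}
customer_relation.add(Customer(632895, "John Smith",
                               "101 Main St. Atlanta, GA 30332",
                               "(404) 894-2000"))

print(len(john), len(customer_relation), in_cust_id_domain(john.cust_id))
```

Using a Python set makes the "no duplicate tuples" property automatic, which is the point of treating a relation as a set rather than a list.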
Relation



Definition: A relation is a named, two-dimensional table of data.
A table consists of rows (records) and columns (attributes or fields).
Requirements for a table to qualify as a relation:
  It must have a unique name.
  Every attribute value must be atomic (not multivalued, not composite).
  Every row/tuple must be unique (no two rows may have exactly the same values for all their fields).
  Attributes (columns) in tables must have unique names.
  The order of the columns must be irrelevant.
  The order of the rows must be irrelevant.
NOTE: all relations are in 1st Normal Form
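Some of the requirements above can be checked mechanically. The sketch below is a hypothetical helper, `is_relation`, that tests unique column names, atomic values, and unique rows. It assumes that any Python list, set, tuple, or dict inside a cell counts as non-atomic. The sample values are invented, and the ordering requirements are not something code can test.

```python
def is_relation(columns, rows):
    """Return True if the table passes the checkable relation requirements."""
    # Attribute names must be unique.
    if len(set(columns)) != len(columns):
        return False
    for row in rows:
        # Every row must supply exactly one value per column.
        if len(row) != len(columns):
            return False
        # Every value must be atomic: no collections inside a cell.
        if any(isinstance(v, (list, set, tuple, dict)) for v in row):
            return False
    # Every row must be unique.
    return len(set(rows)) == len(rows)


columns = ["Emp_ID", "Name", "Course_Title"]
good = [(100, "Simpson", "SPSS"), (100, "Simpson", "Surveys")]
multivalued = [(100, "Simpson", ["SPSS", "Surveys"])]
duplicated = [(100, "Simpson", "SPSS"), (100, "Simpson", "SPSS")]

print(is_relation(columns, good),
      is_relation(columns, multivalued),
      is_relation(columns, duplicated))
```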
Correspondence with E-R Model




Relations (tables) correspond with entity types and with many-to-many relationship types.
Rows correspond with entity instances and with many-to-many relationship instances.
Columns correspond with attributes.
NOTE: The word relation (in relational database) is NOT the same as the word relationship (in E-R model).
DEFINITION SUMMARY
Informal Terms        Formal Terms
Table                 Relation
Column                Attribute/Domain
Row                   Tuple
Values in a column    Domain
Table Definition      Schema of a Relation
Populated Table       Extension
Relation Example
Key Fields

Keys are special fields that serve two main purposes:
  Primary keys are unique identifiers of the relation in question. Examples include employee numbers, social security numbers, etc. This is how we can guarantee that all rows are unique.
  Foreign keys are identifiers that enable a dependent relation (on the many side of a relationship) to refer to its parent relation (on the one side of the relationship).
Keys can be simple (a single field) or composite (more than one field).
Keys are usually used as indexes to speed up the response to user queries.
Figure 5-3 Schema for four relations (Pine Valley Furniture Company)
Primary Key
Foreign Key (implements 1:N relationship between customer and order)
Combined, these are a composite primary key (uniquely identifies the order line); individually they are foreign keys (implement M:N relationship between order and product)
Integrity Constraints

Domain Constraints
  Allowable values for an attribute.
Entity Integrity
  No primary key attribute may be null. All primary key fields MUST have data.
Action Assertions
  Business rules.
Domain definitions enforce domain integrity constraints
Integrity Constraints

Referential Integrity: a rule stating that any foreign key value (in the relation on the many side) MUST match a primary key value in the relation on the one side (or the foreign key can be null).
For example: Delete Rules
  Restrict: don't allow delete of "parent" side if related rows exist in "dependent" side
  Cascade: automatically delete "dependent" side rows that correspond with the "parent" side row to be deleted
  Set-to-Null: set the foreign key in the dependent side to null if deleting from the parent side (not allowed for weak entities)
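The three delete rules can be tried out with SQLite through Python's standard sqlite3 module. This is a minimal sketch under stated assumptions: the table names `customer` and `order_t`, their columns, and the helper `orders_after_delete` are invented for illustration, and SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3


def orders_after_delete(delete_rule):
    """Delete the parent row under the given rule; report what happens to the child."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
        conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY)")
        conn.execute(
            "CREATE TABLE order_t ("
            "order_id INTEGER PRIMARY KEY, "
            "customer_id INTEGER REFERENCES customer(customer_id) "
            f"ON DELETE {delete_rule})"
        )
        conn.execute("INSERT INTO customer VALUES (1)")
        conn.execute("INSERT INTO order_t VALUES (1001, 1)")
        try:
            conn.execute("DELETE FROM customer WHERE customer_id = 1")
        except sqlite3.IntegrityError:
            return "delete rejected"
        return conn.execute(
            "SELECT order_id, customer_id FROM order_t").fetchall()
    finally:
        conn.close()


for rule in ("RESTRICT", "CASCADE", "SET NULL"):
    print(rule, orders_after_delete(rule))
```

Restrict blocks the delete, Cascade removes the dependent order, and Set-to-Null keeps the order but clears its foreign key.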
Figure 5-5
Referential integrity constraints (Pine Valley Furniture)
Referential integrity constraints are drawn via arrows from dependent to parent table
Figure 5-6 SQL table definitions
Referential integrity constraints are implemented with foreign key to primary key references
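The figure itself did not survive extraction, so the following is only a sketch of what such table definitions might look like for the four relations of Figure 5-3. The table names, column names, and sample values are assumptions chosen for illustration, and the DDL is run through Python's sqlite3 module.

```python
import sqlite3

DDL = """
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL
);
CREATE TABLE product (
    product_id          INTEGER PRIMARY KEY,
    product_description TEXT
);
CREATE TABLE order_t (
    order_id    INTEGER PRIMARY KEY,
    order_date  TEXT,
    -- foreign key implementing the 1:N relationship customer -> order
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
);
CREATE TABLE order_line (
    -- each column is a foreign key; together they form the primary key
    order_id         INTEGER NOT NULL REFERENCES order_t(order_id),
    product_id       INTEGER NOT NULL REFERENCES product(product_id),
    ordered_quantity INTEGER,
    PRIMARY KEY (order_id, product_id)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript(DDL)
conn.execute("INSERT INTO customer VALUES (1, 'Customer A')")


def try_insert(sql):
    """Return True if the insert is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


ok = try_insert("INSERT INTO order_t VALUES (1001, '2008-03-01', 1)")
orphan = try_insert("INSERT INTO order_t VALUES (1002, '2008-03-02', 99)")
print(ok, orphan)
```

The second insert refers to a customer that does not exist, so the foreign key reference rejects it.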
Transforming EER Diagrams into Relations
Mapping Regular Entities to Relations
1. Simple attributes: E-R attributes map directly onto the relation
2. Composite attributes: Use only their simple, component attributes
3. Multivalued attribute: Becomes a separate relation with a foreign key taken from the superior entity
Figure 5-8 Mapping a regular entity
(a) CUSTOMER entity type with simple attributes
(b) CUSTOMER relation
Figure 5-9 Mapping a composite attribute
(a) CUSTOMER entity type with composite attribute
(b) CUSTOMER relation with address detail
Figure 5-10 Mapping an entity with a multivalued attribute
(a)
Multivalued attribute becomes a separate relation with foreign key
(b)
One–to–many relationship between original entity and new relation
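The three rules for regular entities (simple, composite, and multivalued attributes) can be sketched as a small flattening function. The function `map_entity`, the EMPLOYEE attribute names, and the sample values are hypothetical and serve only to illustrate the rules.

```python
def map_entity(entity, key, composite=(), multivalued=()):
    """Flatten one entity instance into a main row plus rows for new relations."""
    main_row = {}
    extra_rows = {}
    for name, value in entity.items():
        if name in composite:
            # Composite attribute: keep only its simple component attributes.
            main_row.update(value)
        elif name in multivalued:
            # Multivalued attribute: one row per value in a separate relation,
            # carrying the owner's key as a foreign key.
            extra_rows[name] = [{key: entity[key], name: v} for v in value]
        else:
            # Simple attribute: maps directly onto the relation.
            main_row[name] = value
    return main_row, extra_rows


employee = {
    "Employee_ID": 100,
    "Employee_Name": "Smith",
    "Address": {"Street": "1 Main St", "City": "Atlanta", "State": "GA"},
    "Skill": ["Typing", "Welding"],
}
main, extra = map_entity(employee, "Employee_ID",
                         composite={"Address"}, multivalued={"Skill"})
print(main)
print(extra)
```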
Transforming EER Diagrams into Relations
(cont.)
Mapping Weak Entities
Becomes a separate relation with a foreign key taken from the superior entity
Primary key composed of:
  Partial identifier of weak entity
  Primary key of identifying relation (strong entity)

Figure 5-11 Example of mapping a weak entity
a) Weak entity DEPENDENT
Figure 5-11 Example of mapping a weak entity (cont.)
b) Relations resulting from weak entity
NOTE: the domain constraint for the foreign key should NOT allow null value if DEPENDENT is a weak entity
Foreign key
Composite primary key
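A weak-entity mapping along these lines might look as follows in SQLite. This is a hedged sketch: the column names (`dependent_name`, `employee_id`, `date_of_birth`) and the sample values are simplifications assumed for this example and are not taken from the figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (
    employee_id   INTEGER PRIMARY KEY,
    employee_name TEXT
);
CREATE TABLE dependent (
    dependent_name TEXT NOT NULL,              -- partial identifier
    employee_id    INTEGER NOT NULL            -- key of the identifying owner
        REFERENCES employee(employee_id),
    date_of_birth  TEXT,
    PRIMARY KEY (dependent_name, employee_id)  -- composite primary key
);
INSERT INTO employee VALUES (1, 'E1');
INSERT INTO employee VALUES (2, 'E2');
INSERT INTO dependent VALUES ('Pat', 1, NULL);
""")


def accepted(sql):
    """Return True if the statement runs, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


same_name_other_owner = accepted("INSERT INTO dependent VALUES ('Pat', 2, NULL)")
same_name_same_owner = accepted("INSERT INTO dependent VALUES ('Pat', 1, NULL)")
null_owner = accepted("INSERT INTO dependent VALUES ('Lee', NULL, NULL)")
print(same_name_other_owner, same_name_same_owner, null_owner)
```

The partial identifier alone is not unique, but combined with the owner's key it is, and a dependent without an owner is rejected.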
Transforming EER Diagrams into Relations
(cont.)
Mapping Binary Relationships



One-to-Many: Primary key on the one side becomes a foreign key on the many side
Many-to-Many: Create a new relation with the primary keys of the two entities as its primary key
One-to-One: Primary key on the mandatory side becomes a foreign key on the optional side
Figure 5-12 Example of mapping a 1:M relationship
a) Relationship between customers and orders
Note the mandatory one
b) Mapping the relationship
Foreign key
Again, no null value in the foreign key; this is because of the mandatory minimum cardinality
Figure 5-13 Example of mapping an M:N relationship
a) Completes relationship (M:N)
The Completes relationship will need to become a separate relation
Figure 5-13 Example of mapping an M:N relationship (cont.)
b) Three resulting relations
Composite primary key
Foreign key
Foreign key
New intersection relation
Figure 5-14 Example of mapping a binary 1:1 relationship
a) In_charge relationship (1:1)
Often in 1:1 relationships, one direction is optional.
Figure 5-14 Example of mapping a binary 1:1 relationship (cont.)
b) Resulting relations
Foreign key goes in the relation on the optional side, matching the primary key on the mandatory side
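The M:N and 1:1 mappings can be sketched in SQLite as below. The table and column names are assumptions made for this example. In particular, it assumes every care center must have a nurse in charge while a nurse need not be in charge of any center, and it uses a UNIQUE constraint on the foreign key as one possible way to keep the relationship 1:1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- M:N: EMPLOYEE completes COURSE, mapped to an intersection relation
CREATE TABLE employee (employee_id INTEGER PRIMARY KEY);
CREATE TABLE course (course_id INTEGER PRIMARY KEY);
CREATE TABLE certificate (
    employee_id    INTEGER NOT NULL REFERENCES employee(employee_id),
    course_id      INTEGER NOT NULL REFERENCES course(course_id),
    date_completed TEXT,
    PRIMARY KEY (employee_id, course_id)
);
-- 1:1: the foreign key sits in one relation and is kept unique
CREATE TABLE nurse (nurse_id INTEGER PRIMARY KEY);
CREATE TABLE care_center (
    center_id       INTEGER PRIMARY KEY,
    nurse_in_charge INTEGER NOT NULL UNIQUE REFERENCES nurse(nurse_id)
);
INSERT INTO employee VALUES (1);
INSERT INTO employee VALUES (2);
INSERT INTO course VALUES (10);
INSERT INTO course VALUES (20);
INSERT INTO nurse VALUES (7);
INSERT INTO certificate VALUES (1, 10, NULL);
INSERT INTO certificate VALUES (1, 20, NULL);
INSERT INTO certificate VALUES (2, 10, NULL);
INSERT INTO care_center VALUES (100, 7);
""")


def rejected(sql):
    """Return True if a constraint rejects the statement."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


duplicate_pair = rejected("INSERT INTO certificate VALUES (1, 10, NULL)")
second_center = rejected("INSERT INTO care_center VALUES (101, 7)")
certificates = conn.execute("SELECT COUNT(*) FROM certificate").fetchone()[0]
print(duplicate_pair, second_center, certificates)
```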
Transforming EER Diagrams into Relations
(cont.)
Mapping Associative Entities

Identifier Not Assigned
  Default primary key for the association relation is composed of the primary keys of the two entities (as in M:N relationship)
Identifier Assigned
  It is natural and familiar to end-users
  Default identifier may not be unique

Figure 5-15 Example of mapping an associative entity
a) An associative entity
Figure 5-15 Example of mapping an associative entity (cont.)
b) Three resulting relations
Composite primary key formed from the two foreign keys
Figure 5-16 Example of mapping an associative entity with
an identifier
a) SHIPMENT associative entity
Figure 5-16 Example of mapping an associative entity with
an identifier (cont.)
b) Three resulting relations
Primary key differs from foreign keys
Transforming EER Diagrams into Relations
(cont.)
Mapping Unary Relationships


One-to-Many: Recursive foreign key in the same relation
Many-to-Many: Two relations:
  One for the entity type
  One for an associative relation in which the primary key has two attributes, both taken from the primary key of the entity

Figure 5-17 Mapping a unary 1:N relationship
(a) EMPLOYEE entity with unary relationship
(b) EMPLOYEE relation with recursive foreign key
Figure 5-18 Mapping a unary M:N relationship
(a) Bill-of-materials relationships (M:N)
(b) ITEM and COMPONENT relations
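Both unary mappings can be illustrated with plain Python structures. The data, the attribute names (`manager_id`, the item codes), and the helper functions are invented for this sketch: a recursive foreign key for the 1:N case and a COMPONENT-style relation whose two key attributes both refer to ITEM for the M:N case.

```python
# 1:N unary: EMPLOYEE with a recursive foreign key to its own primary key.
employee = {
    1: {"name": "A", "manager_id": None},
    2: {"name": "B", "manager_id": 1},
    3: {"name": "C", "manager_id": 2},
}


def management_chain(emp_id):
    """Follow the recursive foreign key upward and list each manager's ID."""
    chain = []
    manager = employee[emp_id]["manager_id"]
    while manager is not None:
        chain.append(manager)
        manager = employee[manager]["manager_id"]
    return chain


# M:N unary: ITEM plus a COMPONENT relation keyed by (item, component).
item = {"A", "B", "C"}
component = {("A", "B"): 2, ("A", "C"): 1, ("B", "C"): 4}  # value = quantity


def components_of(item_no):
    """Return the direct components of an item with their quantities."""
    return {comp: qty for (parent, comp), qty in component.items()
            if parent == item_no}


# Both halves of every COMPONENT key must refer to an existing ITEM.
references_ok = all(p in item and c in item for p, c in component)

print(management_chain(3), components_of("A"), references_ok)
```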
Transforming EER Diagrams into Relations
(cont.)
Mapping Ternary (and n-ary) Relationships
One relation for each entity and one for the associative entity
Associative entity has foreign keys to each entity in the relationship
Figure 5-19 Mapping a ternary relationship
a) PATIENT TREATMENT Ternary relationship with
associative entity
Figure 5-19 Mapping a ternary relationship (cont.)
b) Mapping the ternary relationship PATIENT TREATMENT
Remember that the primary key MUST be unique. This is why treatment date and time are included in the composite primary key.
But this makes a very cumbersome key. It would be better to create a surrogate key like Treatment#.
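A surrogate key for the ternary relation might be set up as sketched below. All table and column names and the sample dates are assumptions for illustration. Keeping a UNIQUE constraint on the original five-attribute combination is a design choice added here, not something the slide prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE patient (patient_id INTEGER PRIMARY KEY);
CREATE TABLE physician (physician_id INTEGER PRIMARY KEY);
CREATE TABLE treatment (treatment_code TEXT PRIMARY KEY);
CREATE TABLE patient_treatment (
    treatment_no   INTEGER PRIMARY KEY,  -- surrogate key, assigned by SQLite
    patient_id     INTEGER NOT NULL REFERENCES patient(patient_id),
    physician_id   INTEGER NOT NULL REFERENCES physician(physician_id),
    treatment_code TEXT NOT NULL REFERENCES treatment(treatment_code),
    treatment_date TEXT NOT NULL,
    treatment_time TEXT NOT NULL,
    results        TEXT,
    -- the cumbersome natural key is still kept unique
    UNIQUE (patient_id, physician_id, treatment_code,
            treatment_date, treatment_time)
);
INSERT INTO patient VALUES (1);
INSERT INTO physician VALUES (10);
INSERT INTO treatment VALUES ('T1');
""")

INSERT = ("INSERT INTO patient_treatment "
          "(patient_id, physician_id, treatment_code, "
          "treatment_date, treatment_time) VALUES (1, 10, 'T1', ?, ?)")
conn.execute(INSERT, ("2008-03-01", "09:00"))
conn.execute(INSERT, ("2008-03-01", "14:00"))  # same day, different time
try:
    conn.execute(INSERT, ("2008-03-01", "09:00"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

treatment_numbers = [row[0] for row in conn.execute(
    "SELECT treatment_no FROM patient_treatment ORDER BY treatment_no")]
print(treatment_numbers, duplicate_rejected)
```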
Transforming EER Diagrams into
Relations (cont.)
Mapping Supertype/Subtype Relationships
One relation for supertype and for each subtype
Supertype attributes (including identifier and subtype discriminator) go into supertype relation
Subtype attributes go into each subtype; primary key of supertype relation also becomes primary key of subtype relation
1:1 relationship established between supertype and each subtype, with supertype as primary table
Mapping EER Model Constructs to
Relations

Options for Mapping Specialization or Generalization.
Convert each specialization with m subclasses {S1, S2, ..., Sm} and generalized superclass C, where the attributes of C are {k, a1, ..., an} and k is the (primary) key, into relational schemas using one of the four following options:
Option A: Multiple relations - superclass and subclasses.
Create a relation L for C with attributes Attrs(L) = {k, a1, ..., an} and PK(L) = k. Create a relation Li for each subclass Si, 1 ≤ i ≤ m, with the attributes Attrs(Li) = {k} ∪ {attributes of Si} and PK(Li) = k. This option works for any specialization (total or partial, disjoint or overlapping).
Option B: Multiple relations - subclass relations only.
Create a relation Li for each subclass Si, 1 ≤ i ≤ m, with the attributes Attrs(Li) = {attributes of Si} ∪ {k, a1, ..., an} and PK(Li) = k. This option only works for a specialization whose subclasses are total (every entity in the superclass must belong to at least one of the subclasses).
EER diagram notation for an attribute-defined specialization on
JobType.
Generalization. (b) Generalizing CAR and TRUCK into
the superclass VEHICLE.
Mapping EER Model Constructs to
Relations (cont)
Option C: Single relation with one type attribute.
Create a single relation L with attributes Attrs(L) = {k, a1, ..., an} ∪ {attributes of S1} ∪ ... ∪ {attributes of Sm} ∪ {t} and PK(L) = k. The attribute t is called a type (or discriminating) attribute that indicates the subclass to which each tuple belongs. This option works for disjoint subclasses.
Option D: Single relation with multiple type attributes.
Create a single relation schema L with attributes Attrs(L) = {k, a1, ..., an} ∪ {attributes of S1} ∪ ... ∪ {attributes of Sm} ∪ {t1, t2, ..., tm} and PK(L) = k. Each ti, 1 ≤ i ≤ m, is a Boolean type attribute indicating whether a tuple belongs to the subclass Si. This option works for overlapping subclasses as well.
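The four options (A to D) for mapping a specialization can be written as one small schema generator. This is a sketch only: the function `map_specialization`, the generated attribute names `type` and `is_<subclass>`, and the VEHICLE/CAR/TRUCK attributes used in the example are assumptions for illustration.

```python
def map_specialization(option, super_name, key, super_attrs, subclasses):
    """Return {relation name: attribute list} for mapping options A to D."""
    if option == "A":
        # Superclass relation plus one relation per subclass, all keyed by k.
        schemas = {super_name: [key] + super_attrs}
        for name, attrs in subclasses.items():
            schemas[name] = [key] + attrs
        return schemas
    if option == "B":
        # Subclass relations only; each repeats the superclass attributes.
        return {name: [key] + super_attrs + attrs
                for name, attrs in subclasses.items()}
    all_sub_attrs = [a for attrs in subclasses.values() for a in attrs]
    if option == "C":
        # Single relation with one type (discriminating) attribute t.
        return {super_name: [key] + super_attrs + all_sub_attrs + ["type"]}
    if option == "D":
        # Single relation with one Boolean type attribute per subclass.
        flags = [f"is_{name.lower()}" for name in subclasses]
        return {super_name: [key] + super_attrs + all_sub_attrs + flags}
    raise ValueError(f"unknown option: {option}")


subclasses = {"CAR": ["NoOfPassengers"], "TRUCK": ["Tonnage"]}
for option in "ABCD":
    print(option, map_specialization(option, "VEHICLE", "VehicleId",
                                     ["Price"], subclasses))
```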
EER diagram notation for an attribute-defined specialization
on JobType.
EER diagram notation for an overlapping (nondisjoint)
specialization.
Mapping EER Model Constructs to Relations
(cont)

Mapping of Shared Subclasses (Multiple Inheritance)
A shared subclass, such as STUDENT_ASSISTANT, is a
subclass of several classes, indicating multiple inheritance. These
classes must all have the same key attribute; otherwise, the
shared subclass would be modeled as a category.
We can apply any of the options discussed for mapping specialization or generalization to a shared subclass, subject to the restrictions stated for each option. Below, both options C and D are used for the shared subclass STUDENT_ASSISTANT.
A specialization lattice with multiple inheritance for a UNIVERSITY database.
Mapping the EER specialization lattice in
Figure 4.6 using multiple options.
Mapping EER Model Constructs to
Relations (cont)
Mapping of Union Types (Categories).
For mapping a category whose defining superclasses have different keys, it is customary to specify a new key attribute, called a surrogate key, when creating a relation to correspond to the category.
In the example below we can create a relation OWNER to correspond to the OWNER category and include any attributes of the category in this relation.
The primary key of the OWNER relation is the surrogate key, which we call OwnerId.
Two categories (union types): OWNER and REGISTERED_VEHICLE.
Mapping the EER categories (union types) into relations.
Mapping Exercise
Exercise
FIGURE
An ER schema for a SHIP_TRACKING database.
Chapter Summary

ER-to-Relational Mapping Algorithm








Step 1: Mapping of Regular Entity Types
Step 2: Mapping of Weak Entity Types
Step 3: Mapping of Binary 1:1 Relationship Types
Step 4: Mapping of Binary 1:N Relationship Types
Step 5: Mapping of Binary M:N Relationship Types
Step 6: Mapping of Multivalued Attributes
Step 7: Mapping of N-ary Relationship Types
Mapping EER Model Constructs to Relations


Step 8: Options for Mapping Specialization or Generalization.
Step 9: Mapping of Union Types (Categories).
Data Normalization

Primarily a tool to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of data

The process of decomposing relations with anomalies to produce smaller, well-structured relations
Well-Structured Relations


A relation that contains minimal data redundancy and allows users to insert, delete, and update rows without causing data inconsistencies
Goal is to avoid anomalies
  Insertion Anomaly: adding new rows forces user to create duplicate data
  Deletion Anomaly: deleting rows may cause a loss of data that would be needed for other future rows
  Modification Anomaly: changing data in a row forces changes to other rows because of duplication
General rule of thumb: A table should not pertain to more than one entity type
Example–Figure 5-2b
Question–Is this a relation?
Answer–Yes: Unique rows and no
multivalued attributes
Question–What’s the primary key?
Answer–Composite: Emp_ID, Course_Title
Anomalies in this Table



Insertion: can't enter a new employee without having the employee take a class
Deletion: if we remove employee 140, we lose information about the existence of a Tax Acc class
Modification: giving a salary increase to employee 100 forces us to update multiple records
Why do these anomalies exist?
Because there are two themes (entity types) in this one relation. This results in data duplication and an unnecessary dependency between the entities
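The three anomalies can be reproduced on a small stand-in for the table. The figure itself is not available here, so the rows below are illustrative assumptions: only employee numbers 100 and 140 and the Tax Acc course come from the text, and the salaries, the other course titles, and employee 190 are invented.

```python
# Primary key is the composite (Emp_ID, Course_Title).
rows = [
    {"Emp_ID": 100, "Salary": 48000, "Course_Title": "SPSS"},
    {"Emp_ID": 100, "Salary": 48000, "Course_Title": "Surveys"},
    {"Emp_ID": 140, "Salary": 52000, "Course_Title": "Tax Acc"},
    {"Emp_ID": 110, "Salary": 43000, "Course_Title": "SPSS"},
]

# Modification anomaly: a raise for employee 100 must touch every one of
# that employee's rows.
rows_to_update = [r for r in rows if r["Emp_ID"] == 100]

# Deletion anomaly: removing employee 140 also removes the only record
# that a Tax Acc class exists.
courses_before = {r["Course_Title"] for r in rows}
courses_after = {r["Course_Title"] for r in rows if r["Emp_ID"] != 140}
lost_courses = courses_before - courses_after

# Insertion anomaly: a new employee with no course has a null in part of
# the primary key, which entity integrity forbids.
new_employee = {"Emp_ID": 190, "Salary": 55000, "Course_Title": None}
can_insert = all(new_employee[k] is not None
                 for k in ("Emp_ID", "Course_Title"))

print(len(rows_to_update), lost_courses, can_insert)
```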
Functional Dependencies and Keys


Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute
Candidate Key:
  A unique identifier. One of the candidate keys will become the primary key
  E.g. perhaps there is both credit card number and SS# in a table; in this case both are candidate keys
  Each non-key field is functionally dependent on every candidate key
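A functional dependency can be tested against sample data with a short helper. The function `holds` and the sample rows are assumptions for illustration. Note that sample data can only refute a dependency: a dependency that holds in these rows may still be false as a business rule.

```python
def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent holds in the given rows."""
    seen = {}
    for row in rows:
        lhs = tuple(row[a] for a in determinant)
        rhs = tuple(row[a] for a in dependent)
        # The first row with this determinant value fixes the dependent
        # value; any later row that disagrees breaks the dependency.
        if seen.setdefault(lhs, rhs) != rhs:
            return False
    return True


rows = [
    {"Emp_ID": 100, "Salary": 48000, "Course_Title": "SPSS"},
    {"Emp_ID": 100, "Salary": 48000, "Course_Title": "Surveys"},
    {"Emp_ID": 140, "Salary": 52000, "Course_Title": "Tax Acc"},
]

print(holds(rows, ["Emp_ID"], ["Salary"]))
print(holds(rows, ["Emp_ID"], ["Course_Title"]))
print(holds(rows, ["Emp_ID", "Course_Title"], ["Salary"]))
```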
Figure 5.22 Steps in normalization
First Normal Form
No multivalued attributes
Every attribute value is atomic
Fig. 5-25 is not in 1st Normal Form (multivalued attributes), so it is not a relation
Fig. 5-26 is in 1st Normal Form
All relations are in 1st Normal Form



Table with multivalued attributes, not in 1st normal form
Note: this is NOT a relation
Table with no multivalued attributes and unique rows, in 1st normal form
Note: this is a relation, but not a well-structured one
Anomalies in this Table



Insertion: if new product is ordered for order 1007 of existing customer, customer data must be re-entered, causing duplication
Deletion: if we delete the Dining Table from Order 1006, we lose information concerning this item's finish and price
Update: changing the price of product ID 4 requires update in several records
1NF Anomalies
Why do these anomalies exist?
Because there are multiple themes (entity types) in one relation. This results in duplication and an unnecessary dependency between the entities
Second Normal Form

1NF PLUS every non-key attribute is fully functionally dependent on the ENTIRE primary key
  Every non-key attribute must be defined by the entire key, not by only part of the key
  No partial functional dependencies
Figure 5-27 Functional dependency diagram for INVOICE
Order_ID → Order_Date, Customer_ID, Customer_Name, Customer_Address
Customer_ID → Customer_Name, Customer_Address
Product_ID → Product_Description, Product_Finish, Unit_Price
Order_ID, Product_ID → Order_Quantity
Therefore, NOT in 2nd Normal Form
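The conclusion can be checked by computing attribute closures over the four dependencies listed above. This is a sketch: the helpers `closure` and `partial_dependencies` are illustrative names, and the composite key (Order_ID, Product_ID) is read off the last dependency.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


FDS = [
    (("Order_ID",),
     ("Order_Date", "Customer_ID", "Customer_Name", "Customer_Address")),
    (("Customer_ID",), ("Customer_Name", "Customer_Address")),
    (("Product_ID",),
     ("Product_Description", "Product_Finish", "Unit_Price")),
    (("Order_ID", "Product_ID"), ("Order_Quantity",)),
]
KEY = ("Order_ID", "Product_ID")


def partial_dependencies(key, fds):
    """Map each proper subset of the key to the non-key attributes it determines."""
    found = {}
    for size in range(1, len(key)):
        for part in combinations(key, size):
            determined = closure(part, fds) - set(key)
            if determined:
                found[part] = determined
    return found


partials = partial_dependencies(KEY, FDS)
in_2nf = not partials
print(in_2nf)
for part, attrs in partials.items():
    print(part, "->", sorted(attrs))
```

Any non-empty result means some non-key attribute depends on only part of the key, which is exactly the 2NF violation.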
Figure 5-28 Removing partial dependencies
Getting it into Second Normal Form
Partial dependencies are removed, but there are still transitive dependencies
Third Normal Form



2NF PLUS no transitive dependencies (functional dependencies on non-primary-key attributes)
Note: This is called transitive because the primary key is a determinant for another attribute, which in turn is a determinant for a third
Solution: A non-key determinant with transitive dependencies goes into a new table; the non-key determinant becomes the primary key in the new table and stays as a foreign key in the old table
Figure 5-28 Removing partial dependencies
Getting it into Third Normal Form
Transitive dependencies are removed
Merging Relations


View Integration: Combining entities from multiple ER models into common relations
Issues to watch out for when merging entities from different ER models:
  Synonyms: two or more attributes with different names but same meaning
  Homonyms: attributes with same name but different meanings
  Transitive dependencies: even if relations are in 3NF prior to merging, they may not be after merging
  Supertype/subtype relationships: may be hidden prior to merging
Normalization – 1 to 3 NF revisited

Just another way of explaining the normalization rules
First Normal Form

Disallows




composite attributes
multivalued attributes
nested relations; attributes whose values for an individual tuple are non-atomic
Considered to be part of the definition of relation
First Normal Form
The two examples above show multivalued, non-atomic attributes.
In the worst case you might need 1000 columns in your table, and even that is not flexible enough: what if a 1001st person is hired?
It means we would forever be changing the structure of the database table.
First normal form
• “A relation is in first normal form if every attribute is single valued for each tuple (record)”
• Each attribute in each row, or each “cell” of the table, contains only one value.
• No repeating fields or groups are allowed.
The first rule dictates that we must not duplicate data within the same row of a table.
Within the database community, this concept is referred to as the atomicity of a table.
Tables that comply with this rule are said to be atomic.
First normal form
Stu_id   Stu_Name   Major          Credits   Status
S1001    Smith      Math           90        Senior
S1002    Lee        CS, Math       15        Fresh
S1003    Jon        Art, English   63        Junior

Stu_id   Stu_Name   Major1   Major2    Credits   Status
S1001    Smith      Math     Math      90        Senior
S1002    Lee        CS       Math      15        Fresh
S1003    Jon        Art      English   63        Junior

Students table (students have more than one major).
The rule of 1NF (every attribute is single valued for each tuple) is violated in the records of S1002 and S1003, who have two values listed for Major.
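One common way to remove the violation is to emit one row per major, so that each cell holds a single value. The sketch below applies that to the Students table. The function `to_1nf` is a hypothetical helper, and it assumes the majors are stored as a comma-separated string.

```python
students = [
    ("S1001", "Smith", "Math", 90, "Senior"),
    ("S1002", "Lee", "CS, Math", 15, "Fresh"),
    ("S1003", "Jon", "Art, English", 63, "Junior"),
]


def to_1nf(rows, multivalued_index=2, separator=","):
    """Emit one row per value of the multivalued column."""
    flat = []
    for row in rows:
        for value in row[multivalued_index].split(separator):
            flat.append(row[:multivalued_index]
                        + (value.strip(),)
                        + row[multivalued_index + 1:])
    return flat


for row in to_1nf(students):
    print(row)
```

After this step Stu_id alone no longer identifies a row; the key becomes the combination of Stu_id and Major.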
Normalization into 1NF
Normalization of nested relations into 1NF
Second normal form
A relation is in second normal form (2NF) if and only if
1. It is in first normal form
2. All the non-key attributes are fully functionally dependent on the key.
If the relation is in 1NF and the key consists of a single attribute, then the relation is automatically in 2NF.
Non-key attributes = attributes that don't appear in any candidate key
3.3 Second Normal Form (1)


Uses the concepts of FDs, primary key
Definitions
  Prime attribute: An attribute that is a member of the primary key K
  Full functional dependency: an FD Y → Z where removal of any attribute from Y means the FD does not hold any more
Examples:
  {SSN, PNUMBER} → HOURS is a full FD since neither SSN → HOURS nor PNUMBER → HOURS holds
  {SSN, PNUMBER} → ENAME is not a full FD (it is called a partial dependency) since SSN → ENAME also holds
Second Normal Form (2)

A relation schema R is in second normal form (2NF) if every non-prime attribute A in R is fully functionally dependent on the primary key

R can be decomposed into 2NF relations via the process of 2NF normalization

If the relation is in 1NF and the key consists of a single attribute, then the relation is automatically in 2NF.
Non-key attributes = attributes that don't appear in any candidate key
Normalizing into 2NF and 3NF
3.4 Third Normal Form (1)

Definition:
  Transitive functional dependency: an FD X → Z that can be derived from two FDs X → Y and Y → Z
Examples:
  SSN → DMGRSSN is a transitive FD, since SSN → DNUMBER and DNUMBER → DMGRSSN hold
  SSN → ENAME is non-transitive, since there is no set of attributes X where SSN → X and X → ENAME
Third Normal Form

A relation is in third normal form if


It is in 2NF
No non-key attribute is transitively dependent on the key
Third Normal Form
Stu_id   Stu_Name   Credits   Status
S1001    Smith      90        Senior
S1002    Lee        15        Fresh
S1003    Jon        63        Junior

Transitive dependency (A → B → C)
Stu_id → Credits → Status
If Stu_id → Credits and Credits → Status
then we can write
Stu_id → Status
Third Normal Form
Student
Stu_id   Stu_Name   Credits
S1001    Smith      90
S1002    Lee        15
S1003    Jon        63

Status
Credits   Status
15        Fresh
63        Junior
90        Senior

Transitive dependency is removed
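The same decomposition can be carried out in a few lines of Python, which also shows that joining the two relations on Credits gives back the original rows. The variable names are illustrative only.

```python
students = [
    ("S1001", "Smith", 90, "Senior"),
    ("S1002", "Lee", 15, "Fresh"),
    ("S1003", "Jon", 63, "Junior"),
]

# Student(Stu_id, Stu_Name, Credits): Credits stays behind as a foreign key.
student = [(sid, name, cr) for sid, name, cr, _ in students]

# Status(Credits, Status): the non-key determinant becomes the primary key.
status = sorted({(cr, st) for _, _, cr, st in students})

# Joining on Credits rebuilds the original rows, so nothing is lost.
status_by_credits = dict(status)
rejoined = [(sid, name, cr, status_by_credits[cr])
            for sid, name, cr in student]

print(student)
print(status)
print(rejoined == students)
```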
Third Normal Form (2)



A relation schema R is in third normal form (3NF) if it is in 2NF and no non-prime attribute A in R is transitively dependent on the primary key
R can be decomposed into 3NF relations via the process of 3NF normalization
NOTE:
  In X → Y and Y → Z, with X as the primary key, we consider this a problem only if Y is not a candidate key.
  When Y is a candidate key, there is no problem with the transitive dependency.
  E.g., consider EMP (SSN, Emp#, Salary). Here, SSN → Emp# → Salary and Emp# is a candidate key.
Normal Forms Defined Informally

1st normal form
  All attributes depend on the key (atomicity)
2nd normal form
  All attributes depend on the whole key
3rd normal form
  All attributes depend on nothing but the key (no transitivity)
Successive Normalization of LOTS into 2NF and 3NF
SUMMARY OF NORMAL FORMS
based on Primary Keys
Enterprise Keys
Primary keys that are unique in the whole database, not just within a single relation
Corresponds with the concept of an object ID in object-oriented systems
Figure 5-31 Enterprise keys
a) Relations with enterprise key
b) Sample data with enterprise key