Lecture Note 3

advertisement

ITEC 3220M

Using and Designing Database Systems

Instructor: Prof. Z. Yang

Course Website: http://people.yorku.ca/~zyang/itec

3220m.htm

Office: Tel 3049

Supertypes and Subtypes

• Generalization hierarchy: depicts relationships between higher-level supertype and lower-level subtype entities

• Supertype: contains the shared attributes

• Subtype: contains the unique attributes

• Inheritance:

– Subtype entities inherit values of all attributes of the supertype

– An instance of a subtype is also an instance of the supertype

2

Supertypes and Subtypes (Cont’d)

Supertype/ subtype relationships

Attributes shared by all entities

General entity type

SUPERTYPE

SUBTYPE1

Attributes unique to subtype1

SUBTYPE2

Attributes unique to subtype2

And so forth

Specialized version of supertype

3

Supertypes and Subtypes (Cont’d)

• Disjoint relationships

– Unique subtypes

– Non-overlapping

– Indicated with a ‘G’

• Overlapping subtypes

– An instance of the supertype could be more than one of the subtypes

– Indicated with a ‘Gs’

4

Generalization Hierarchy with

Overlapping Subtypes

5

Chapter 5

Logical Database Design and

Normalization of Database Tables

In this chapter, you will learn:

• How to transform ERD into relations

• What normalization is and what role it plays in database design

• About the normal forms 1NF, 2NF, 3NF, BCNF, and 4NF

• How normal forms can be transformed from lower normal forms to higher normal forms

• Normalization and E-R modeling are used concurrently to produce a good database design

• Some situations require denormalization to generate information efficiently

7

Transforming ERD into Relations

• Step one: Map regular entities

– Each regular entity type in an ER diagram is transformed into a relation

– The name given to the relation is generally the same as the entity type

– Each simple attribute of the entity type becomes an attribute of the relation and the identifier of entity becomes the primary key of the corresponding relation

8

Example

STUDENT

Student_ID Student_Name Other_Attributes

9

Transforming ERD into Relations

(Cont’d)

• Step two: Map weak entities

– Create a new relation and include all of the simple attributes as the attributes of this relation. Then include the primary key of the identifying relation as a foreign key attribute in this new relation. The primary key of the new relation is the combination of this primary key of the identifying relation and the partial identifier of the weak entity type.

10

Example

Employee_ID Employee_Name

Dependent_Name

Date_of_birth

EMPLOYEE

EMPLOYEE

Employee_ID

Has

Employee_Name

DEPENDENT

Dependent_Name Employee_ID

DEPENDENT

Date_of_birth

Gender

Gender

11

Transforming ERD into Relations

(Cont’d)

• Step three: Map binary relationship

– Map Binary one-to-many relations

•First create a relation for each of the two entity types participating in the relationship, using the procedure described in step one.

•Next, include the primary key attribute of the entity on the one-side of the relationship as a foreign key in the relation that is on the many-side of the relationship.

12

Example

Customer_ID Customer_Name

Order_ID Order_Date

1

M

Customer Submits Order

(0,N)

Customer

Customer_ID Customer_Name

(1,1)

Order

Order_ID Order_Date Customer_ID

13

Transforming ERD into Relations

(Cont’d)

• Step three: Map binary relationship

(Cont’d)

– Map binary one-to-one relationships

•First, two relationships are created one for each of the participating entity types.

•Second, the primary key of one of the relations is included as a foreign key in the other relation.

14

Example

Nurse_ID Nurse_Name

Centre_Name Location

Nurse

1

(0,1)

In_charge

1

(1,1)

Care

Centre

Nurse Nurse_ID

Care

Centre

Centre_Name

Nurse_Name

Location Nurse_in_c harge

15

Transforming ERD into Relations

(Cont’d)

• Step Four: Map composite Entities

– First step

• Create three relations: one for each of the two participating entities, and the third for the composite entity. We refer to the relation formed from the composite entity as the composite relation

– Second step

• Identifier not assigned: The default primary key for the composite relation consists of the two primary key attributes from the other two relations.

• Identifier assigned: The primary key for the composite relation is the identifier. The primary keys for the two participating entity types are included as foreign keys in the composite relation.

16

Example

Order_ID

Product_ID

Description

Order

1

M

(1,N)

(1,1)

Order

Line

M

1

Product

(1,1)

(0,N)

Order_Date

Quantity

Price

17

Example

Order

Order_ID Order_Date

Order Line

Product_ID Order_ID Quantity

Product

Product_ID Description Standard_Price

18

Example

Customer_ID

Name

Date

Customer

1 M

Shipment

Vendor_ID

M 1

Address

Vendor

Customer

Shipment_No

Customer_ID

Amount

Customer_Name

Shipment Shipment_N o

Vendor_I

D

Customer_I

D

Date Amount

Vendor Vendor_ID Address 19

Transforming ERD into Relations

(Cont’d)

• Step Five: Map unary relationship

– Map unary one-to-may relationship

•The entity type in the unary relationship is mapped to a relation using the procedure described in Step one. Then a foreign key attribute is added within the same relation that references the primary key values. A recursive foreign key is a foreign key in a relation that references the primary key values of that same relation.

20

Employee_ID

Example

Name

Birthdate

Employee

M (1,1)

1 (0,N)

Manages

Employee

Employee_ID Name Birthdate Manager_ID

21

Transforming ERD into Relations

(Cont’d)

• Step six: Map ternary relationship

– Convert a ternary relationship to a composite entity

– To map a composite entity that links three regular entities, we create a new composite relation. The default primary key of their relation consists of the three primary key attributes for the participating entities. Any attributes of the composite entity become attributes of the new relation

22

Example

Patient_ID

Patient_Name Physician_ID

Physician_Name

Patient

1

Results

M

(0,N)

(1,1)

(0,N)

(1,1)

M

Patient

Treatment

Physician

1

Date

Time

M (1,1)

Treatment_

Code

1 (0,N)

Treatment

Description

23

Example

Patient

Patient_ID Patient_Name

Physician

Physician_ID Physician_Name

Patient Treatment

Patient_

ID

Treatment

Physician_I

D

Treatment_C ode

Date Time Result

Treatment_Code Description

24

Transforming ERD into Relations

(Cont’d)

• Step seven: Map supertype/subtype relationships

1. Create a separate relation for the supertype and for each of its subtype

2. Assign to the relation created for the supertype the attributes that are common to all members of the supertype, including the primary key

3. Assign to the relation for each subtype the primary key of the supertype, and only those attributes that are unique to that subtype

4. Assign one attribute of the supertype to function as the subtype discriminator

25

Example

Employee_Name

Address

Employee_Type

Employee_Number

Hourly

Employee

Employee

Gs

Salaried

Employee

Date_Hired

Hourly_Rate Annual_Salary Stock_Option

26

Example

Employee

Employee_Nu mber

Employee_N ame

Hourly_Employee

H_Employee_Number

Address

Hourly_Rate

Employee_Ty pe

Date_Hi red

Salaried_Employee

S_Employee_Number Annual_Salary Stock_Option

27

Database Tables and

Normalization

• Table is the basic building block in database design

• Normalization is the process for assigning attributes to entities

– Reduces data redundancies

– Helps eliminate data anomalies

– Produces controlled redundancies to link tables

• Normalization stages

– 1NF - First normal form

– 2NF - Second normal form

– 3NF - Third normal form

– 4NF - Fourth normal form

28

Need for Normalization

29

Anomalies In the Table

• PRO_NUM intended to be primary key

• Table displays data anomalies

– Update

• Modifying JOB_CLASS

– Insertion

• New employee must be assigned project

– Deletion

• If employee deleted, other vital data lost

30

Conversion to First Normal Form

• Repeating group

– Derives its name from the fact that a group of multiple entries of same type can exist for any single key attribute occurrence

• Relational table must not contain repeating groups

• Normalizing table structure will reduce data redundancies

• Normalization is three-step procedure

31

Conversion to First Normal Form

(continued)

• Step 1: Eliminate the Repeating Groups

– Present data in tabular format, where each cell has single value and there are no repeating groups

– Eliminate repeating groups, eliminate nulls by making sure that each repeating group attribute contains an appropriate data value

32

Conversion to First Normal Form

(continued)

33

Conversion to First Normal Form

(continued)

• Step 2: Identify the Primary Key

– Primary key must uniquely identify attribute value

– New key must be composed

34

Conversion to First Normal Form

(continued)

• Step 3: Identify all dependencies

– Dependencies can be depicted with help of a diagram

– Dependency diagram:

•Depicts all dependencies found within given table structure

•Helpful in getting bird’s-eye view of all relationships among table’s attributes

•Makes it less likely that will overlook an important dependency

35

Conversion to First Normal Form

(continued)

36

Conversion to First Normal Form

(continued)

• First normal form describes tabular format in which:

– All key attributes are defined

– There are no repeating groups in the table

– All attributes are dependent on primary key

• All relational tables satisfy 1NF requirements

• Some tables contain partial dependencies

– Dependencies based on only part of the primary key

– Sometimes used for performance reasons, but should be used with caution

– Still subject to data redundancies

37

Conversion to Second Normal

Form

• Relational database design can be improved by converting the database into second normal form (2NF)

• Two steps

38

Conversion to Second Normal

Form (continued)

• Step 1: Write Each Key Component on a Separate Line

– Write each key component on separate line, then write original (composite) key on last line

– Each component will become key in new table

39

Conversion to Second Normal

Form (continued)

• Step 2: Assign Corresponding

Dependent Attributes

– Determine those attributes that are dependent on other attributes

– At this point, most anomalies have been eliminated

40

Conversion to Second Normal Form

(continued)

41

Conversion to Second Normal

Form (continued)

• Table is in second normal form (2NF) when:

– It is in 1NF and

– It includes no partial dependencies:

•No attribute is dependent on only portion of primary key

42

Conversion to Third Normal Form

• Data anomalies created are easily eliminated by completing three steps

• Step 1: Identify Each New Determinant

– For every transitive dependency, write its determinant as PK for new table

•Determinant

– Any attribute whose value determines other values within a row

43

Conversion to Third Normal Form

(continued)

• Step 2: Identify the Dependent

Attributes

– Identify attributes dependent on each determinant identified in Step 1 and identify dependency

– Name table to reflect its contents and function

44

Conversion to Third Normal Form

(continued)

• Step 3: Remove the Dependent Attributes from

Transitive Dependencies

– Eliminate all dependent attributes in transitive relationship(s) from each of the tables that have such a transitive relationship

– Draw new dependency diagram to show all tables defined in Steps 1–3

– Check new tables as well as tables modified in

Step 3 to make sure that each table has determinant and that no table contains inappropriate dependencies

45

Conversion to Third Normal Form

(continued)

46

Conversion to Third Normal Form

(continued)

• A table is in third normal form (3NF) when both of the following are true:

– It is in 2NF

– It contains no transitive dependencies

47

The Boyce-Codd Normal Form

(BCNF)

• Every determinant in table is a candidate key

– Has same characteristics as primary key, but for some reason, not chosen to be primary key

• When table contains only one candidate key, the

3NF and the BCNF are equivalent

• BCNF can be violated only when table contains more than one candidate key

48

The Boyce-Codd Normal Form

(BCNF) (continued)

• Most designers consider the BCNF as special case of 3NF

• Table is in 3NF when it is in 2NF and there are no transitive dependencies

• Table can be in 3NF and fails to meet

BCNF

– No partial dependencies, nor does it contain transitive dependencies

– A nonkey attribute is the determinant of a key attribute

49

The Boyce-Codd Normal Form

(BCNF) (continued)

50

The Boyce-Codd Normal Form

(BCNF) (continued)

51

An Example

GRADE( Student_ID, Student_Name, Address, Major,

Course_ID, Course_Title, Instructor_Name,

Instructor_Office, Grade)

52

Normalization and Database

Design

• Normalization should be part of the design process

• E-R Diagram provides macro view

• Normalization provides micro view of entities

– Focuses on characteristics of specific entities

– May yield additional entities

• Difficult to separate normalization from E-R diagramming

• Business rules must be determined

53

Higher-Level Normal Forms

• Fourth Normal Form (4NF)

– Table is in 3NF

– Has no multiple sets of multivalued dependencies

54

Conversion to 4NF

Stud-ID

1126

1126

1126

1126

1126

Stud-ID

1126

1126

1126

Stud-ID Course

1126 1212F

1126

1126

1620F

1320F

Course

1212F

1620F

1320F

Course

1212F

1620F

1320F

Service

Red Cross

United Way

Service

Red Cross

United Way

Service

Red Cross

United Way

Multivalued Dependencies

Stud-ID

1126

1126

1126

Stud-ID

1126

1126

Course

1212F

1620F

1320F

Service

Red Cross

United Way

Set of Tables in 4NF

55

Denormalization

• Normalization is one of many database design goals

• Normalized table requires

– Additional processing

– Loss of system speed

• Normalization purity is difficult to sustain due to conflict in:

– Design efficiency

– Information requirements

– Processing

56

Exercise

Part Supplier Data

Part_

No

Description Vendor_Nam e

1234 Logic Chips Fast Chips

Smart Chips

Address Unit_C ost

Cupertino

Phoenix

10.00

8.00

5678 Memory Chips Fast Chips

Quality Chips

Smart Chips

Cupertino

Austin

Phoenix

3.00

2.00

5.00

57

Exercise(Cont’d)

• Convert the table to a relation in first normal form

(Named Part Supplier)

• List the functional dependency in the Part Supplier and identify a candidate key

• For the relation Part Supplier, identify the followings: an insert anomaly, a delete anomaly, and a modification anomaly.

• Draw a relation schema and show the functional dependencies

• Develop a set of 3NF relations from Part Supplier

58

Download