The Entity-Relationship - Department of Computer Science | UNC

The Entity-Relationship (ER) Model
Chapter 2
Instructor: Jianping Fan
http://www.cs.uncc.edu/~jfan
Database Management Systems
Raghu Ramakrishnan
Overview of Database Design
a. What is a database?
---A database consists of many tables and their inter-table relationships
---Each table holds a set of “similar” tuples
b. What should a database address?
---Tables and their relationships
---Submission and processing of queries
3 key issues: Description, Organization & Search
Example: University Database
Tables:
Students: SID, sname, year, GPA
Departments: DID, dname, office
Faculties: ssn, fname, f-office, phone, salary
Courses: CID, cname, time, room, credit-hour
Inter-Table Relationships:
Students enroll in Courses, Faculties teach Courses
Faculties work for Departments
Queries over multiple tables are allowed!
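The tables and a multi-table query can be sketched with Python's built-in sqlite3 module. This is only an illustration, not part of the original slides: the Enrolled table, the column types, and the sample rows are assumptions made for the example.

```python
import sqlite3

# In-memory database. Table and column names follow the slide; the types,
# the Enrolled table, and the sample rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (SID TEXT PRIMARY KEY, sname TEXT, year INTEGER, GPA REAL);
CREATE TABLE Courses  (CID TEXT PRIMARY KEY, cname TEXT, credit_hour INTEGER);
-- "Students enroll in Courses" becomes a table of (SID, CID) pairs.
CREATE TABLE Enrolled (
    SID TEXT REFERENCES Students(SID),
    CID TEXT REFERENCES Courses(CID),
    PRIMARY KEY (SID, CID)
);
""")
conn.execute("INSERT INTO Students VALUES ('s1', 'Ann', 2, 3.5)")
conn.execute("INSERT INTO Students VALUES ('s2', 'Bob', 3, 3.0)")
conn.execute("INSERT INTO Courses VALUES ('c1', 'Databases', 3)")
conn.execute("INSERT INTO Enrolled VALUES ('s1', 'c1')")

# A query over multiple tables: names of students enrolled in 'Databases'.
rows = conn.execute("""
    SELECT S.sname
    FROM Students S
    JOIN Enrolled E ON E.SID = S.SID
    JOIN Courses  C ON C.CID = E.CID
    WHERE C.cname = 'Databases'
""").fetchall()
print(rows)
```

The join touches three tables, which is the kind of query the inter-table relationships make possible.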
Overview of Database Design
Requirement Analysis -> ER design -> Relational Schema -> DBMS
– Requirement analysis: analyze the data and the users’ requirements to identify entity sets and relationship sets
– Conceptual database design: use the ER model for description
– Logical database design: translate the ER model into a relational database
Beyond the ER model:
– Schema refinement: analyze the collection of relations
– Physical database design: database indexing
– Security design: access control

Overview of Database Design
Requirement analysis: analyze the data and the users’ requirements to identify entity sets and relationship sets
a. What kinds of attributes should be included to describe a tuple?
b. Which attributes should be indexed? (Consider frequent operations and access.)
c. Long schema or short schema (a balance of storage and access efficiency)?
d. Table integration or separation? (Consider updating frequency.)
e. What query optimization framework should be used?
Overview of Database Design

Conceptual design: (ER Model is used at this stage.)
a. Should a concept be modeled as an entity or an attribute?
b. Should a concept be modeled as an entity or a relationship?
c. What are relationship sets and their participating entity sets?
d. Should we use binary or ternary relationships?
e. Should we use aggregation?
Overview of Database Design
Logical database design: ER model to relational database
a. How do we create physical database tables?
b. How do we transform ER models (conceptual database design) into physical database tables?
c. How do we generate physical database indexes for fast queries?
Overview of Database Design
Schema refinement: analyze the collection of relations
a. Which tables should be separated into multiple smaller tables?
b. Which tables can be integrated into one single larger table?
c. Which attributes should be added to the existing schema?
d. Which attributes can be deleted from the existing schema?
Overview of Database Design
Physical database design: database indexing
a. Which attributes should be selected for indexing?
   The attributes most frequently used in query formulation
b. What kind of indexing structure should be selected?
   Does the workload need range search or equality search?
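The difference between equality search and range search can be sketched in Python. This is an illustration only: a dict stands in for a hash-style index and a sorted list with bisect stands in for an ordered (tree-style) index; the records and the function names are made up for the example.

```python
import bisect

# Hypothetical (GPA, SID) records; the values are made up for illustration.
records = [(3.5, "s1"), (3.0, "s2"), (3.8, "s3"), (3.5, "s4")]

# Hash-style index: maps a GPA value directly to matching SIDs.
# Good for equality search, but it keeps no useful ordering.
hash_index = {}
for gpa, sid in records:
    hash_index.setdefault(gpa, []).append(sid)

# Ordered index: records sorted by GPA, so a range is one contiguous slice.
sorted_index = sorted(records)
keys = [gpa for gpa, _ in sorted_index]

def equal_search(gpa):
    """Return the SIDs whose GPA equals gpa."""
    return hash_index.get(gpa, [])

def range_search(low, high):
    """Return the SIDs whose GPA lies in [low, high]."""
    lo = bisect.bisect_left(keys, low)
    hi = bisect.bisect_right(keys, high)
    return [sid for _, sid in sorted_index[lo:hi]]
```

If most queries on an attribute are range queries, the ordered structure is the natural choice; if they are exact-match lookups, the hash-style structure is enough.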
Overview of Database Design
Security design: access control
Who can access what in the database, and under which conditions?
Overview of Database Design

Conceptual design: (ER Model is used at this stage.)
– What are the entities and relationships in the enterprise?
– What information about these entities and relationships
should we store in the database (i.e., attributes)?
– What are the integrity constraints or business rules that
hold?
– A database “schema” in the ER Model can be represented pictorially (ER diagrams).
– An ER diagram can be mapped into a relational schema.
University Database

The university database contains employees and departments, which are described by certain attributes.

The university database also contains relationships between employees and departments, which are also described by certain attributes.
ER Model Basics
[Figure: entity set Employees with attributes ssn, name, lot]

Entity: Real-world object distinguishable from other
objects. An entity is described (in DB) using a
set of attributes.
ID           name        year  GPA
999-80-3267  John Smith  2003  3.5
999-32-0847  James Gary  2006  3.0
ER Model Basics
[Figure: entity set Employees with attributes ssn, name, lot]

Entity Set: A collection of similar entities. E.g., all
employees.
– All entities in an entity set have the same set of attributes.
– Each entity set has a key.
– Each attribute has a domain.
What is the key? How many keys can one entity set have?
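The three properties of an entity set (same attributes, a key, a domain per attribute) can be sketched in Python. This is a minimal illustration under stated assumptions: the EntitySet class, the choice of ssn as the key, the domain check on lot, and the sample values are all made up for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Employee:
    """One entity; every Employee has the same set of attributes."""
    ssn: str    # key attribute (assumed for this sketch)
    name: str
    lot: int

    def __post_init__(self):
        # Domain check: lot values must be integers.
        if not isinstance(self.lot, int):
            raise TypeError("lot must be an int")

class EntitySet:
    """A collection of similar entities, identified by a key attribute."""
    def __init__(self, key):
        self.key = key
        self._by_key = {}

    def add(self, entity):
        k = getattr(entity, self.key)
        if k in self._by_key:
            # Two entities may not share the same key value.
            raise ValueError(f"duplicate key value: {k}")
        self._by_key[k] = entity

    def __len__(self):
        return len(self._by_key)

employees = EntitySet(key="ssn")
employees.add(Employee("111-11-1111", "A. Smith", 12))
employees.add(Employee("222-22-2222", "B. Jones", 7))
```

Adding a third employee with ssn "111-11-1111" raises ValueError, which is what it means for ssn to be a key.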
Entity, Entity Set, Attribute, Schema & Domain
ID or SSN    Name         Year  Age  GPA
999-38-4431  John Smith   1999  21   3.68
999-28-3341  Miki Jordan  2000  28   3.45
331-43-4567  David Kim    2000  25   4.00
535-34-5678  Paul Lee     1998  26   3.89
ER Model Basics (Contd.)
[Figure: relationship set Works_In (attribute since) between Employees (ssn, name, lot) and Departments (did, dname, budget)]
Relationship: Association among two or more entities.
Examples:
– Fan works in the Computer Science Department.
– Smith works in the Electronic Engineering Department.
ER Model Basics (Contd.)
[Figure: relationship set Works_In (attribute since) between Employees (ssn, name, lot) and Departments (did, dname, budget)]

Relationship Set: Collection of similar relationships.
– An n-ary relationship set R relates n entity sets E1 ... En; each relationship in R involves entities e1 ∈ E1, ..., en ∈ En
– The same entity set could participate in different relationship sets, or in different “roles” in the same set.
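A relationship set can be viewed as a set of tuples drawn from the participating entity sets. The Python sketch below illustrates that view; the entity identifiers, the helper is_relationship_set, and the reports_to example of one entity set playing two roles are assumptions made for illustration.

```python
from itertools import product

# Entity sets, represented by hypothetical identifiers.
employees = {"e1", "e2"}
departments = {"d1", "d2"}

# A binary relationship set: each tuple takes one entity from each entity set.
works_in = {("e1", "d1"), ("e2", "d1")}

def is_relationship_set(r, *entity_sets):
    """True if every tuple in r has one entity from each entity set, in order."""
    return r <= set(product(*entity_sets))

# The same entity set in two roles: (subordinate, supervisor).
reports_to = {("e2", "e1")}
```

Here works_in is a valid relationship set over (employees, departments), while a tuple such as ("e1", "d3") is rejected because "d3" is not a department.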
Entity vs. Entity Set
Student --- Students
John Smith
(999-21-3415, jsmith@, John Smith, 18, 3.5)
Students in ITCS3160
999-21-3415, jsmith@, John Smith, 18, 3.5
999-31-2356, jzhang@, Jie Zhang, 20, 3.0
999-32-1234, ajain@, Anil Jain, 21, 3.8
Entity Keys
Primary key
Candidate key
999-21-3415, jsmith@, John Smith, 18, 3.5
999-31-2356, jzhang@, Jie Zhang, 20, 3.0
999-32-1234, ajain@, Anil Jain, 21, 3.8
Relationship vs. Relationship Set
John Smith
(999-21-3415, jsmith@, John Smith, 18, 3.5)
Relationship
ITCS3160
(3160, ITCS, DBMS, J. Fan, 3, Kenn. 236)
Relationship vs. Relationship Set
999-21-3415, jsmith@, John Smith, 18, 3.5
999-31-2356, jzhang@, Jie Zhang, 20, 3.0
999-32-1234, ajain@, Anil Jain, 21, 3.8
Relationship set
3160, ITCS, DBMS, J. Fan, 3, Kenn. 236
6157, ITCS, Visual DB, J. Fan, 3, Kenn. 236
Potential Relationship Types
1-to-1
1-to-Many
Many-to-1
Many-to-Many
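The four types can be told apart by counting how many partners each entity has. The Python sketch below classifies one instance of a binary relationship set; the function name classify and the sample pairs in the usage note are assumptions, and a real design states the type as a constraint on all instances rather than inferring it from one.

```python
from collections import Counter

def classify(pairs):
    """Classify an instance of a binary relationship set given as (left, right) pairs."""
    pairs = set(pairs)
    left_counts = Counter(left for left, _ in pairs)     # partners per left entity
    right_counts = Counter(right for _, right in pairs)  # partners per right entity
    left_unique = all(c == 1 for c in left_counts.values())    # each left has one right
    right_unique = all(c == 1 for c in right_counts.values())  # each right has one left
    if left_unique and right_unique:
        return "1-to-1"
    if right_unique:
        return "1-to-Many"    # one left entity relates to many right entities
    if left_unique:
        return "Many-to-1"    # many left entities relate to one right entity
    return "Many-to-Many"
```

For example, classify({("e1", "d1"), ("e2", "d1")}) gives "Many-to-1": many employees, one department.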
Example 1

Build an ER Diagram for the following information:
– Students
   Have an Id, Name, Login, Age, GPA
– Courses
   Have an Id, Name, Credit Hours
– Students enroll in courses
   Receive a grade
Example 1 Answer
[Figure: Students (Id, Name, Login, Age, GPA) and Courses (Id, Name, Credit) connected by relationship set Enrolled_In with attribute Grade]
Example 2

Build an ER Diagram for the following information:
– Patients
   Name, Address, Phone #, Age
– Drugs
   Name, Manufacturer, Expiration Date
– Patients are prescribed drugs
   Dosage, # Days
Example 2 Answer
[Figure: Patients (Name, Addr, Phone, Age) and Drug (Name, Manuf, Exp) connected by relationship set Prescribed with attributes Dosage and #days]
Example 3

Build an ER Diagram for the following information:
– Students
   Have an Id, Name, Login, Age, GPA
– Courses
   Have an Id, Name, Credit Hours
– Students enroll in courses
   Receive a grade
– Faculties
   Name, Address, Phone #, Age
– Faculties teach courses
   Semester
Entity vs. Attribute: Ternary Relationship
Should address be an attribute of Employees or an entity (connected to Employees by a relationship)?
Depends upon the use we want to make of address information, and the semantics of the data:
– If we have several addresses per employee, address must be an entity (since attributes cannot be set-valued).
– If the structure (city, street, etc.) is important, e.g., we want to retrieve employees in a given city, address must be modeled as an entity (since attribute values are atomic).


Works_In2 does not allow an employee to work in a department for two or more periods.
Similar to the problem of wanting to record several addresses for an employee: we want to record several values of the descriptive attributes for each instance of this relationship.
[Figure: Works_In2 (attributes from, to) between Employees (ssn, name, lot) and Departments (did, dname, budget)]
[Figure: Works_In3 relating Employees (ssn, name, lot), Departments (did, dname, budget), and Duration (from, to)]
With Works_In3, the same employee can work in the same department in different periods.
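The limitation of Works_In2 can be sketched in Python. A relationship is identified by the entities that take part in it, so the sketch models Works_In2 as a dict keyed by (ssn, did) and Works_In3 as a set of (ssn, did, duration) triples; the ssn, did, and year values are made up for illustration.

```python
# Works_In2: a relationship is identified by (ssn, did), so only one
# (from, to) value can be recorded per employee/department pair.
works_in2 = {}
works_in2[("111", "d1")] = ("2001", "2003")
works_in2[("111", "d1")] = ("2005", "2007")   # overwrites the first period

# Works_In3: Duration is a third participating entity set, so a relationship
# is identified by (ssn, did, duration) and both periods can coexist.
works_in3 = set()
works_in3.add(("111", "d1", ("2001", "2003")))
works_in3.add(("111", "d1", ("2005", "2007")))

print(len(works_in2), len(works_in3))
```

The first structure ends up with one entry and the second with two, which mirrors the difference between the two diagrams.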
Entity vs. Relationship: Ternary Relationship

The first ER diagram is OK if a manager gets a separate discretionary budget for each dept.

What if a manager gets a discretionary budget that covers all managed depts? Then the first diagram has two problems:
– Redundancy of dbudget, which is stored for each dept managed by the manager.
– Misleading: suggests that dbudget is tied to the managed dept.
Entity vs. Relationship: Ternary Relationship

[Figure: first design, Manages2 (attributes since, dbudget) between Employees (ssn, name, lot) and Departments (did, dname, budget)]
[Figure: second design, Manages3 (attribute since) relating Employees (ssn, name, lot), Departments (did, dname, budget), and Mgr_Appts (apptnum, dbudget)]
Binary vs. Ternary Relationships*
If each policy is owned by just 1 employee:
– A key constraint on Policies would mean that a policy can only cover 1 dependent!
[Figure: bad design, ternary relationship set Covers relating Employees (ssn, name, lot), Policies (policyid, cost), and Dependents (pname, age)]
[Figure: better design, relationship set Purchaser between Employees and Policies, and relationship set Beneficiary between Dependents and Policies]
Binary vs. Ternary Relationships (Contd.)
[Figure: Works_In3 relating Employees (ssn, name, lot), Departments (did, dname, budget), and Duration (from, to)]
[Figure: Works_In3 between Employees (ssn, name, lot) and Departments (did, dname, budget), with from and to as attributes of the relationship set]
Binary vs. Ternary Relationships (Contd.)


The previous example illustrated a case where two binary relationships were better than one ternary relationship.
An example in the other direction: a ternary relationship set Contracts relates entity sets Parts, Departments and Suppliers, and has the descriptive attribute qty. No combination of binary relationships is an adequate substitute:
– S “can-supply” P, D “needs” P, and D “deals-with” S does not imply that D has agreed to buy P from S.
– How do we record qty?
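The argument can be checked on a small made-up instance in Python. The supplier, part, and department identifiers, the qty value, and the helper implied_by_binaries are assumptions for illustration only.

```python
# Three binary relationship sets (hypothetical instance data).
can_supply = {("s1", "p1"), ("s2", "p1")}   # S can-supply P
needs      = {("d1", "p1")}                 # D needs P
deals_with = {("d1", "s1"), ("d1", "s2")}   # D deals-with S

# Ternary relationship set Contracts, with descriptive attribute qty:
# (part, dept, supplier) -> qty
contracts = {("p1", "d1", "s1"): 100}

def implied_by_binaries(part, dept, supplier):
    """True if all three binary facts hold for this combination."""
    return ((supplier, part) in can_supply
            and (dept, part) in needs
            and (dept, supplier) in deals_with)

# d1 needs p1, deals with s2, and s2 can supply p1, yet there is no
# contract for d1 to buy p1 from s2.
print(implied_by_binaries("p1", "d1", "s2"), ("p1", "d1", "s2") in contracts)
```

The binary facts hold for both suppliers, but only one contract exists, and qty has no natural home in any of the three binary sets.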
Key Constraints
Consider Works_In: an employee can work in many departments; a dept can have many employees.
[Figure: Works_In (attribute since) between Employees (ssn, name, lot) and Departments (did, dname, budget)]
Key Constraints
Consider Works_In: an employee can work in at most one department; a dept can have many employees.
[Figure: Works_In (attribute since) between Employees (ssn, name, lot) and Departments (did, dname, budget)]
Key Constraints

In contrast, each dept has at most one manager, according to the key constraint on Manages.
[Figure: Manages (attribute since) between Employees (ssn, name, lot) and Departments (did, dname, budget); annotations: "At most one!!!" and "Key Constraint (time constraint)"]
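One way to enforce this key constraint in a relational table is to make did the key of the Manages table. The sketch below uses Python's built-in sqlite3 module; the column types and sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees   (ssn TEXT PRIMARY KEY, name TEXT, lot INTEGER);
CREATE TABLE Departments (did INTEGER PRIMARY KEY, dname TEXT, budget REAL);
-- did alone is the key of Manages, so each dept appears at most once.
CREATE TABLE Manages (
    ssn   TEXT REFERENCES Employees(ssn),
    did   INTEGER REFERENCES Departments(did),
    since TEXT,
    PRIMARY KEY (did)
);
""")
conn.execute("INSERT INTO Employees VALUES ('111', 'Fan', 1)")
conn.execute("INSERT INTO Employees VALUES ('222', 'Smith', 2)")
conn.execute("INSERT INTO Departments VALUES (1, 'Computer Science', 0)")
conn.execute("INSERT INTO Manages VALUES ('111', 1, '2005-01-01')")

# A second manager for the same department violates the key constraint.
try:
    conn.execute("INSERT INTO Manages VALUES ('222', 1, '2006-01-01')")
    second_manager_accepted = True
except sqlite3.IntegrityError:
    second_manager_accepted = False

manages_rows = conn.execute("SELECT COUNT(*) FROM Manages").fetchone()[0]
```

The second insert is rejected, so the table never holds two managers for one department.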
Participation Constraints

Does every department have a manager?
– If so, this is a participation constraint: the participation of
Departments in Manages is said to be total (vs. partial).
 Every did value in Departments table must appear in a
row of the Manages table (with a non-null ssn value!)
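One common way to capture total participation together with the key constraint is to fold Manages into the Departments table and require a non-null manager. The sqlite3 sketch below illustrates this; the merged table name Dept_Mgr, the column types, and the sample rows are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Employees (ssn TEXT PRIMARY KEY, name TEXT, lot INTEGER);
-- Departments and Manages merged: every dept row must name a manager.
CREATE TABLE Dept_Mgr (
    did    INTEGER PRIMARY KEY,
    dname  TEXT,
    budget REAL,
    ssn    TEXT NOT NULL REFERENCES Employees(ssn),
    since  TEXT
);
""")
conn.execute("INSERT INTO Employees VALUES ('111', 'Fan', 1)")
conn.execute("INSERT INTO Dept_Mgr VALUES (1, 'Computer Science', 0, '111', '2005-01-01')")

def try_insert(sql):
    """Return True if the statement is accepted, False on a constraint violation."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# A department without a manager, or with an unknown manager, is rejected.
no_manager = try_insert("INSERT INTO Dept_Mgr VALUES (2, 'Physics', 0, NULL, NULL)")
unknown_manager = try_insert("INSERT INTO Dept_Mgr VALUES (3, 'Math', 0, '999', NULL)")
```

Because did is the key of the merged table and ssn is NOT NULL, each department has exactly one manager.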
[Figure: Employees (ssn, name, lot) and Departments (did, dname, budget) connected by Manages (since) and Works_In (since); participation labels: Partial, Total, Total w/key constraint, Total]
What are the policies behind this ER model?
[Figure: Employees (ssn, name, lot) and Departments (did, dname, budget) connected by Manages (since) and Works_In (since); participation labels: Total, Total, Total w/key constraint, Total]
Any difference?
[Figure: Employees (ssn, name, lot) and Departments (did, dname, budget) connected by Manages (since) and Works_In (since), with no participation labels]
[Figure: the same diagram with participation labels: Partial, Total, Total w/key constraint, Total]
Weak Entities vs. Owner Entities

A weak entity can be identified uniquely only by
considering the primary key of another (owner) entity.
– Owner entity set and weak entity set must participate in a
one-to-many relationship set (1 owner, many weak entities).
– Weak entity set must have total participation in this
identifying relationship set.
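A weak entity set and its identifying relationship can be sketched as one table whose key combines the weak entity's own identifying attribute with the owner's key. The sqlite3 example below is an illustration only; the table name Dep_Policy, the column types, the cascade rule, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # needed for ON DELETE CASCADE to take effect
conn.executescript("""
CREATE TABLE Employees (ssn TEXT PRIMARY KEY, name TEXT, lot INTEGER);
-- Dependents (weak entity) plus Policy (identifying relationship) in one table.
CREATE TABLE Dep_Policy (
    pname TEXT,
    age   INTEGER,
    cost  REAL,
    ssn   TEXT NOT NULL REFERENCES Employees(ssn) ON DELETE CASCADE,
    PRIMARY KEY (pname, ssn)   -- pname is unique only together with the owner's key
);
""")
conn.execute("INSERT INTO Employees VALUES ('111', 'Fan', 1)")
conn.execute("INSERT INTO Employees VALUES ('222', 'Smith', 2)")
conn.execute("INSERT INTO Dep_Policy VALUES ('Joe', 5, 100, '111')")
conn.execute("INSERT INTO Dep_Policy VALUES ('Joe', 7, 120, '222')")  # same pname, different owner

# Deleting the owner removes its weak entities as well.
conn.execute("DELETE FROM Employees WHERE ssn = '111'")
remaining = conn.execute("SELECT pname, ssn FROM Dep_Policy").fetchall()
```

Two dependents named Joe can coexist because they belong to different owners, and a dependent cannot outlive its owner.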
[Figure: Employees (ssn, name, lot) connected by the identifying relationship set Policy (cost) to the weak entity set Dependents (pname, age); annotations: "Primary Key for weak entity", "Identifying Relationship", "Weak Entity"]
Ternary Relationship
[Figure: Works_In3 relating Employees (ssn, name, lot), Departments (did, dname, budget), and Duration (from, to)]
Why?
[Figure: Works_In (attribute since) between Employees (ssn, name, lot) and Departments (did, dname, budget)]
ISA (“is a”) Hierarchies
As in C++, or other PLs, attributes are inherited.
If we declare A ISA B, every A entity is also considered to be a B entity.
[Figure: Employees (ssn, name, lot) with ISA to Hourly_Emps (hourly_wages, hours_worked) and Contract_Emps (contractid)]
– Overlap constraints: Can Joe be an Hourly_Emps as well as a Contract_Emps entity? (Allowed/disallowed)
– Covering constraints: Does every Employees entity also have to be an Hourly_Emps or a Contract_Emps entity? (Yes/no)
– Reasons for using ISA:
   – To add descriptive attributes specific to a subclass.
   – To identify entities that participate in a relationship.
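The ISA, overlap, and covering conditions can be phrased as simple set tests. The Python sketch below represents each entity set by the ssn values of its members; the ssn values and the helper names are made up for illustration.

```python
# Entity sets represented by hypothetical ssn values.
employees = {"111", "222", "333"}
hourly_emps = {"111", "222"}
contract_emps = {"222"}

def isa_holds(sub, sup):
    """A ISA B: every A entity is also a B entity."""
    return sub <= sup

def overlaps(a, b):
    """Overlap: some entity belongs to both subclasses."""
    return bool(a & b)

def covers(sup, *subs):
    """Covering: every superclass entity belongs to at least one subclass."""
    return sup <= set().union(*subs)

print(isa_holds(hourly_emps, employees),
      overlaps(hourly_emps, contract_emps),
      covers(employees, hourly_emps, contract_emps))
```

In this instance the two subclasses overlap on "222", and "333" is in neither subclass, so the covering condition does not hold.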
Aggregation
Used when we have to model a relationship involving (entity sets and) a relationship set.
– Aggregation allows us to treat a relationship set as an entity set for purposes of participation in (other) relationships.
– Monitors is mapped to a table like any other relationship set.
[Figure: Employees (ssn, name, lot) connected by Monitors (until) to the aggregation of Sponsors, which relates Projects (pid, started_on, pbudget) and Departments (did, dname, budget)]
Real Database Design

Build an ER Diagram for the following information:
– Walmart Stores
   Store Id, Address, Phone #
– Products
   Product Id, Description, Price
– Manufacturers
   Name, Address, Phone #
– Walmart Stores carry products
   Amount in store
– Manufacturers make products
   Amount in factory/warehouses
Conceptual Design Using the ER Model

Design choices:
– Should a concept be modeled as an entity or an attribute?
– Should a concept be modeled as an entity or a relationship?
– Identifying relationships: Binary or Ternary? Aggregation?
Summary of Conceptual Design

Conceptual design follows requirements analysis.
– Yields a high-level description of data to be stored

The ER model is popular for conceptual design.
– Constructs are expressive, close to the way people think about their applications.
Basic constructs: entities, relationships, and attributes (of entities and relationships).
Some additional constructs: weak entities, ISA hierarchies, and aggregation.
Note: There are many variations on the ER model.

Summary of ER (Contd.)

Several kinds of integrity constraints can be
expressed in the ER model: key constraints,
participation constraints, and overlap/covering constraints
for ISA hierarchies. Some foreign key constraints are
also implicit in the definition of a relationship set.
– Some constraints (notably, functional dependencies) cannot
be expressed in the ER model.
– Constraints play an important role in determining the best
database design for an enterprise.
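A functional dependency is a constraint of the form "rows that agree on X must agree on Y". The Python sketch below tests whether one set of rows satisfies such a dependency; the function name fd_holds and the sample rows are assumptions, and passing the test on one instance does not prove that the dependency holds for every legal instance.

```python
def fd_holds(rows, lhs, rhs):
    """Check whether the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are lists of attribute names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # The first row with this key fixes the expected right-hand side.
        if seen.setdefault(key, val) != val:
            return False
    return True

# Hypothetical rows: does did determine budget?
rows = [
    {"did": 1, "budget": 10},
    {"did": 1, "budget": 10},
    {"did": 2, "budget": 5},
]
print(fd_holds(rows, ["did"], ["budget"]))
```

Adding a row with did 2 and a different budget would make the check fail.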
Summary of ER (Contd.)

ER design is subjective. There are often many ways to
model a given scenario! Analyzing alternatives can be
tricky, especially for a large enterprise. Common
choices include:
– Entity vs. attribute, entity vs. relationship, binary or n-ary
relationship, whether or not to use ISA hierarchies, and
whether or not to use aggregation.

Ensuring good database design: resulting relational
schema should be analyzed and refined further. FD
information and normalization techniques are
especially useful.
Erwin ER Modeling Tool


http://www.cai.com/products/alm/erwin.htm
Demo ERwin and its capabilities
– Open the sample movies model
   Erwin_3.5.2/models
– Build Example 2 using ERwin
Homework Assignment

Problem 2.4 at the end of Chapter 2
– Page 53
Due next Thursday: hard copy to instructor
Format for homework: name, ID.

Homework Assignment
[Figure: Employees (ssn, name, lot) and Departments (did, dname, budget) connected by Manages (since) and Works_In (since); Employees also connected by Policy (cost) to Dependents (pname, age); participation labels: Partial, Total, Total w/key constraint, partial, Total, Key/total]