The Entity-Relationship Model

Chapter 2

Conceptual design (the ER model is used at this stage):
◦ What are the entities and relationships in the enterprise?
◦ What information about these entities and relationships
  should we store in the database?
◦ What are the integrity constraints or business rules that
  hold?
◦ A database "schema" in the ER model can be represented
  pictorially (ER diagrams).
◦ An ER diagram can be mapped into a relational schema.


The entity-relationship (ER) data model allows us to describe the data
involved in a real-world enterprise in terms of objects and their
relationships, and it is widely used to develop an initial database
design. It provides a way to move from what users want to what can
be implemented in a DBMS.
Database design can be divided into six steps. The ER model is
most relevant to the first three:
1. Requirements analysis: understand what data is to be stored in
the database, what applications must be built on top of it, and
what operations are most frequent and subject to performance
requirements (that is, what the users want from the database).
2. Conceptual database design: the information gathered in the
requirements analysis step is used to develop a high-level
description of the data to be stored in the database. This step
often uses the ER model, which is one of several high-level, or
semantic, data models.
3. Logical database design: convert the conceptual database
design (ER schema) into a relational database schema
(logical schema).
4. Schema refinement: analyze the collection of relations in the
relational database schema to identify potential problems, and
refine it.
5. Physical database design: consider the typical expected
workloads that the database must support and further refine the
design to ensure that it meets the desired performance criteria
(e.g., building indexes on some tables, clustering some tables,
or redesigning parts of the database schema).
6. Application and security design: design methodologies like UML
try to address the complete software design and development cycle.
Briefly, we must identify the entities and processes involved in the
application, describe the role of each entity in every process, and
determine which parts of the database each role needs to access.






The main construct for representing data in the relational
model is a relation.
A relation consists of a relation schema and a relation
instance. The relation instance is a table, and the relation
schema describes the column heads for the table.
An example of a relation schema is: Student(Sid: string,
name: string, age: integer).
An instance of a relation is a set of tuples, also called
records, in which each tuple has the same number of
fields as the relation schema.
A relational database is a collection of relations with
distinct relation names.
The relational database schema is the collection of
schemas for the relations in the database.
An instance of the Employees relation:

ssn        name       age
1223-133   Attishoo   48
1122-989   Danial     24

[Figure: the Employees entity set with attributes ssn, name, and age.]
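
The distinction between a relation schema and a relation instance can be tried out with a small sketch. It uses Python's built-in sqlite3 module purely as a convenient engine; the choice of SQLite and the column types are assumptions, while the table name and the two rows come from the Employees example above.

```python
import sqlite3

# In-memory database, used here only for illustration.
conn = sqlite3.connect(":memory:")

# The relation schema: relation name plus the name and type of each field.
conn.execute("CREATE TABLE Employees (ssn TEXT, name TEXT, age INTEGER)")

# The relation instance: a set of tuples, each with one value per schema field.
rows = [("1223-133", "Attishoo", 48), ("1122-989", "Danial", 24)]
conn.executemany("INSERT INTO Employees VALUES (?, ?, ?)", rows)

# Column heads come from the schema; the rows come from the instance.
columns = [info[1] for info in conn.execute("PRAGMA table_info(Employees)")]
instance = conn.execute("SELECT * FROM Employees ORDER BY ssn").fetchall()

print(columns)
for record in instance:
    print(record)
```

Every tuple in the instance has exactly as many fields as the schema has columns, which is the property stated in the definition.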

The ER model gives an initial, high-level database design.
Entity: a real-world object distinguishable from other objects.
Entity set: a collection of similar entities. Entity sets need not
be disjoint; e.g., the MIS faculty and the management faculty may
both contain Dr Akram. We may also define one entity set named
Faculty that contains all faculty in these two departments.
◦ All entities in an entity set have the same set of attributes.
  (Until we consider ISA hierarchies, anyway!)
◦ Each entity set has a key.
◦ Each attribute has a domain.
Given an ER diagram describing a database, a standard approach
is taken to generate a relational database schema.
For example, an entity set can be translated into a relation with
the following SQL statement:
CREATE TABLE Employee (
    ssn   CHAR(11),
    name  CHAR(30),
    age   INTEGER,
    PRIMARY KEY (ssn));
[Figure: ER diagram of the Works_In relationship set, with descriptive
attribute since, between Employees (ssn, name, lot) and Departments
(did, dname, budget).]
Relationship: Association among two or more entities.
E.g., Works_In is a relationship set in which each
relationship indicates a department in which an employee
works.
CREATE TABLE Works_In (
    ssn    CHAR(11),
    did    INTEGER,
    since  DATE,
    PRIMARY KEY (ssn, did),
    FOREIGN KEY (ssn) REFERENCES Employee,
    FOREIGN KEY (did) REFERENCES Departments);
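
To see what the key and foreign key clauses of Works_In buy us, here is a minimal sketch that runs the translation in SQLite through Python's sqlite3 module. The Departments column types, the sample rows, and the helper `accepted` are assumptions made for illustration; SQLite also needs foreign key checking switched on explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Employee (ssn CHAR(11), name CHAR(30), age INTEGER, PRIMARY KEY (ssn));
CREATE TABLE Departments (did INTEGER, dname CHAR(20), budget REAL, PRIMARY KEY (did));
CREATE TABLE Works_In (
    ssn CHAR(11), did INTEGER, since DATE,
    PRIMARY KEY (ssn, did),
    FOREIGN KEY (ssn) REFERENCES Employee,
    FOREIGN KEY (did) REFERENCES Departments);
INSERT INTO Employee VALUES ('1223-133', 'Attishoo', 48);
INSERT INTO Departments VALUES (51, 'Toy', 100000);  -- sample department (assumed)
INSERT INTO Works_In VALUES ('1223-133', 51, '1991-01-01');
""")

def accepted(sql):
    """Hypothetical helper: True if the statement succeeds, False if a constraint rejects it."""
    try:
        with conn:  # commit on success, roll back on failure
            conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# No employee has this ssn, so the foreign key on ssn rejects the row.
unknown_employee = accepted("INSERT INTO Works_In VALUES ('0000-000', 51, '1992-01-01')")
# (ssn, did) is the key, so the same pair cannot be recorded twice.
repeated_pair = accepted("INSERT INTO Works_In VALUES ('1223-133', 51, '1995-06-01')")
print(unknown_employee, repeated_pair)
```

Both inserts should be rejected: a relationship can only connect existing entities, and a given employee-department pair appears at most once in the relationship set.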
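[Figure: ER diagram of a ternary Works_In relationship set (descriptive
attribute since) among Employees (ssn, name, lot), Departments (did,
dname, budget), and Locations (address, capacity).]
[Figure: ER diagram of the Reports_To relationship set on Employees
(ssn, name, lot), with role indicators supervisor and subordinate.]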
Ternary relationship:
Suppose that each department has offices in several locations and we want to
record the locations at which each employee works. This relationship is
ternary because we must record an association between an employee, a
department, and a location.
A relationship sometimes involves two entities in the same entity set. For
example, the Reports_To relationship set shows that employees report to other
employees. Every relationship in Reports_To is of the form (emp1, emp2),
where both emp1 and emp2 are entities in the Employees entity set. They play
different roles: emp1 reports to the managing employee emp2.
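
Both cases above translate to tables in the same way as a binary relationship set. The sketch below is one possible translation, run in SQLite via Python's sqlite3 module; the table name Works_In2, the column names supervisor_ssn and subordinate_ssn, the column types, and the sample rows are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Employees (ssn CHAR(11) PRIMARY KEY, name CHAR(30), lot INTEGER);
CREATE TABLE Departments (did INTEGER PRIMARY KEY, dname CHAR(20), budget REAL);
CREATE TABLE Locations (address CHAR(40) PRIMARY KEY, capacity INTEGER);

-- Ternary relationship: one row per (employee, department, location) association.
CREATE TABLE Works_In2 (
    ssn CHAR(11), did INTEGER, address CHAR(40), since DATE,
    PRIMARY KEY (ssn, did, address),
    FOREIGN KEY (ssn) REFERENCES Employees,
    FOREIGN KEY (did) REFERENCES Departments,
    FOREIGN KEY (address) REFERENCES Locations);

-- Same entity set in two roles: the role names become the column names.
CREATE TABLE Reports_To (
    supervisor_ssn CHAR(11), subordinate_ssn CHAR(11),
    PRIMARY KEY (supervisor_ssn, subordinate_ssn),
    FOREIGN KEY (supervisor_ssn) REFERENCES Employees (ssn),
    FOREIGN KEY (subordinate_ssn) REFERENCES Employees (ssn));

INSERT INTO Employees VALUES ('1223-133', 'Attishoo', 48), ('1122-989', 'Danial', 22);
INSERT INTO Departments VALUES (51, 'Toy', 100000);
INSERT INTO Locations VALUES ('Main St', 20), ('Lake Rd', 10);
INSERT INTO Works_In2 VALUES ('1223-133', 51, 'Main St', '1991-01-01');
INSERT INTO Works_In2 VALUES ('1223-133', 51, 'Lake Rd', '1993-01-01');
INSERT INTO Reports_To VALUES ('1223-133', '1122-989');
""")

# Resolve both roles of Reports_To back to employee names.
reports = conn.execute("""
    SELECT sub.name, sup.name
    FROM Reports_To r
    JOIN Employees sub ON sub.ssn = r.subordinate_ssn
    JOIN Employees sup ON sup.ssn = r.supervisor_ssn
""").fetchall()

# The same employee and department can be associated with several locations.
locations = conn.execute(
    "SELECT address FROM Works_In2 WHERE ssn = '1223-133' AND did = 51 ORDER BY address"
).fetchall()
print(reports, locations)
```

Note that Reports_To needs two foreign keys into the same table, one per role, which is why the role names rather than the attribute name ssn are used for its columns.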
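[Figure: ER diagram of the Manages relationship set, with descriptive
attribute since, between Employees (ssn, name, lot) and Departments
(did, dname, budget), with a key constraint on Departments.]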
Consider Works_In: an employee can work in many
departments; a dept can have many employees.
In contrast, consider the Manages relationship set in the above
figure: each dept has at most one manager, although a
single employee is allowed to manage more than one
department. The restriction that each dept has at most
one manager is an example of a key constraint; it implies
that each Departments entity appears in at most one
Manages relationship.
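
One common way to carry this key constraint into a relational schema is to make did alone the key of the Manages table. The following is a sketch under that assumption, run in SQLite via Python's sqlite3 module; column types, sample rows, and the helper `accepted` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Employees (ssn CHAR(11) PRIMARY KEY, name CHAR(30), lot INTEGER);
CREATE TABLE Departments (did INTEGER PRIMARY KEY, dname CHAR(20), budget REAL);
-- did alone is the key: each department appears in at most one Manages row.
CREATE TABLE Manages (
    ssn CHAR(11), did INTEGER, since DATE,
    PRIMARY KEY (did),
    FOREIGN KEY (ssn) REFERENCES Employees,
    FOREIGN KEY (did) REFERENCES Departments);
INSERT INTO Employees VALUES ('1223-133', 'Attishoo', 48), ('1122-989', 'Danial', 22);
INSERT INTO Departments VALUES (51, 'Toy', 100000), (56, 'Games', 50000);
INSERT INTO Manages VALUES ('1223-133', 51, '1991-01-01');
""")

def accepted(sql):
    """Hypothetical helper: True if the statement succeeds, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# One employee may manage more than one department.
same_manager_second_dept = accepted("INSERT INTO Manages VALUES ('1223-133', 56, '1992-01-01')")
# A department may not get a second manager.
second_manager_same_dept = accepted("INSERT INTO Manages VALUES ('1122-989', 51, '1993-01-01')")
print(same_manager_second_dept, second_manager_same_dept)
```

Compare this with Works_In, where the key is the pair (ssn, did) because neither side is restricted.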
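[Figure: ER diagram of the ternary Works_In relationship set (since)
among Employees (ssn, name, lot), Departments (did, dname, budget),
and Locations (address, capacity), with a key constraint on Employees.]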
Each employee works in at most one department and at a
single location: this is a key constraint on Employees in the
ternary Works_In relationship set. Each department can be
associated with several employees and locations, and each
location can be associated with several departments and
employees, but each employee is associated with a single
department and location.

The key constraint on Manages tells us that a department has
at most one manager. But does every department have a
manager?
◦ If so, this is a participation constraint: the participation of
  Departments in Manages is said to be total (otherwise it is partial).
◦ The participation of the entity set Employees in Manages is partial.
  Why? Because not every employee gets to manage a department.
◦ In the Works_In relationship set, it is natural to expect that each
  employee works in at least one department and that each
  department has at least one employee. This means that the
  participation of both Employees and Departments in Works_In is
  total (represented by a thick line).
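
When a participation constraint is combined with a key constraint, as for Departments in Manages, it can be captured in SQL by merging the relationship into the Departments table and declaring the manager column NOT NULL. The sketch below assumes that approach; the table name Dept_Mgr, column types, sample rows, and the helper `accepted` are illustrative. The total participation of both sides in Works_In is harder to express this way and is not attempted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Employees (ssn CHAR(11) PRIMARY KEY, name CHAR(30), lot INTEGER);
-- Departments and Manages merged into one table.
-- NOT NULL on ssn forces every department to have a manager.
CREATE TABLE Dept_Mgr (
    did INTEGER, dname CHAR(20), budget REAL,
    ssn CHAR(11) NOT NULL, since DATE,
    PRIMARY KEY (did),
    FOREIGN KEY (ssn) REFERENCES Employees ON DELETE NO ACTION);
INSERT INTO Employees VALUES ('1223-133', 'Attishoo', 48);
INSERT INTO Dept_Mgr VALUES (51, 'Toy', 100000, '1223-133', '1991-01-01');
""")

def accepted(sql):
    """Hypothetical helper: True if the statement succeeds, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# A department without a manager violates total participation.
dept_without_manager = accepted("INSERT INTO Dept_Mgr VALUES (56, 'Games', 50000, NULL, NULL)")
# Deleting an employee who still manages a department would leave it without a manager.
manager_deleted = accepted("DELETE FROM Employees WHERE ssn = '1223-133'")
print(dept_without_manager, manager_deleted)
```

The primary key on did still enforces the key constraint, so this single table covers "exactly one manager per department".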
[Figure: ER diagram of Manages (since) and Works_In (since) between
Employees (ssn, name, lot) and Departments (did, dname, budget),
showing participation constraints.]






Suppose that employees can purchase insurance policies to cover
their dependents. We wish to record information about policies,
including who is covered by each policy, but this information is our
only interest in the dependents of an employee.
We might choose to identify a dependent by name alone in this
situation, since it is reasonable to expect that the dependents of a
given employee have different names. So pname and age are the only
attributes of the Dependents entity set.
However, the attribute pname does not identify a dependent uniquely.
Recall that the key for Employees is ssn; thus we might have two
employees called John, and each might have a son called Joe.
Dependents is an example of a weak entity set.
A weak entity can be identified uniquely only by considering some of
its attributes (pname) in conjunction with the primary key (ssn) of
another (owner) entity.
◦ Owner entity set and weak entity set must participate in a
  one-to-many relationship set (one owner, many weak entities).
◦ Weak entity set must have total participation in this identifying
  relationship set.
◦ The arrow from Dependents to Policy indicates that each
  Dependents entity appears in at most one Policy relationship.





Example: a Dependents entity can be identified uniquely only if we take the
key of the owning Employees entity (ssn) together with the pname of the
Dependents entity.
The set of attributes of a weak entity set that uniquely identifies a weak
entity for a given owner entity is called a partial key of the weak entity
set. In our example, pname is a partial key for Dependents.
The total participation of Dependents in Policy is indicated by linking
them with a dark line. The arrow from Dependents to Policy indicates that
each Dependents entity appears in at most one Policy relationship.
To underscore the fact that Dependents is a weak entity set and Policy is
its identifying relationship set, we draw both with dark lines. To indicate
that pname is a partial key for Dependents, we underline it using a broken
line. This means that there may well be two dependents with the same pname
value.
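
A weak entity set and its identifying relationship set are commonly translated into a single table whose key combines the partial key with the owner's key. The sketch below assumes that translation, run in SQLite via Python's sqlite3 module; the table name Dep_Policy, the column types, the sample rows, and the helper `accepted` are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Employees (ssn CHAR(11) PRIMARY KEY, name CHAR(30), lot INTEGER);
-- Dependents and Policy in one table: key = partial key (pname) + owner key (ssn).
-- ON DELETE CASCADE: a dependent cannot outlive its owning employee.
CREATE TABLE Dep_Policy (
    pname CHAR(20), age INTEGER, cost REAL,
    ssn CHAR(11) NOT NULL,
    PRIMARY KEY (pname, ssn),
    FOREIGN KEY (ssn) REFERENCES Employees ON DELETE CASCADE);
INSERT INTO Employees VALUES ('1223-133', 'John', 48), ('1122-989', 'John', 22);
INSERT INTO Dep_Policy VALUES ('Joe', 7, 120.0, '1223-133');
INSERT INTO Dep_Policy VALUES ('Joe', 9, 150.0, '1122-989');
""")

def accepted(sql):
    """Hypothetical helper: True if the statement succeeds, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# Two employees may each have a dependent named Joe (inserted above),
# but one employee cannot have two dependents with the same pname.
second_joe_same_owner = accepted("INSERT INTO Dep_Policy VALUES ('Joe', 3, 90.0, '1223-133')")

# Removing the owner removes its dependents as well.
with conn:
    conn.execute("DELETE FROM Employees WHERE ssn = '1223-133'")
remaining = conn.execute("SELECT pname, ssn FROM Dep_Policy").fetchall()
print(second_joe_same_owner, remaining)
```

The NOT NULL foreign key reflects total participation in the identifying relationship, and the composite key reflects that pname is only a partial key.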
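[Figure: ER diagram of the weak entity set Dependents (pname, age) and
its identifying relationship set Policy (cost), owned by Employees
(ssn, name, lot).]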
Sometimes it is natural to classify the entities in an entity set
into subclasses. E.g., an Hourly_Emps entity set with attributes
hours_worked and hourly_wages, and a Contract_Emps entity set with
attribute contractid, distinguish the basis on which employees are
paid.
We want the semantics that every entity in one of these sets is also
an Employees entity, and as such must have all the attributes of
Employees defined.
We say that the attributes of Employees are inherited by Hourly_Emps
and Contract_Emps (as in C++, Java, etc.).
If we declare A ISA B, every A entity is also considered to be a
B entity.
[Figure: ISA hierarchy in which Hourly_Emps (hourly_wages,
hours_worked) and Contract_Emps (contractid) are subclasses of
Employees (ssn, name, lot).]
Overlap constraints: Can Joe be an Hourly_Emps entity as well as a
Contract_Emps entity? Intuitively, no.
Covering constraints: Does every Employees entity also have to be an
Hourly_Emps or a Contract_Emps entity? Intuitively, no.
Reasons for using ISA:
◦ To add descriptive attributes specific to a subclass.
◦ To identify entities that participate in a relationship.
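
One possible relational translation of this ISA hierarchy keeps a table per entity set, with each subclass table holding only its own attributes plus the inherited key. The sketch below assumes that layout and uses SQLite via Python's sqlite3 module; column types, sample rows, and the helper `accepted` are illustrative. It does not enforce the overlap or covering constraints.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Employees (ssn CHAR(11) PRIMARY KEY, name CHAR(30), lot INTEGER);
-- Each subclass table reuses the superclass key and adds its own attributes.
CREATE TABLE Hourly_Emps (
    ssn CHAR(11) PRIMARY KEY, hourly_wages REAL, hours_worked INTEGER,
    FOREIGN KEY (ssn) REFERENCES Employees ON DELETE CASCADE);
CREATE TABLE Contract_Emps (
    ssn CHAR(11) PRIMARY KEY, contractid INTEGER,
    FOREIGN KEY (ssn) REFERENCES Employees ON DELETE CASCADE);
INSERT INTO Employees VALUES ('1223-133', 'Attishoo', 48), ('1122-989', 'Danial', 22);
INSERT INTO Hourly_Emps VALUES ('1223-133', 10.0, 40);
INSERT INTO Contract_Emps VALUES ('1122-989', 7001);
""")

def accepted(sql):
    """Hypothetical helper: True if the statement succeeds, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# Inherited attributes (name, lot) are recovered by joining with Employees.
hourly = conn.execute("""
    SELECT e.name, e.lot, h.hourly_wages, h.hours_worked
    FROM Hourly_Emps h JOIN Employees e ON e.ssn = h.ssn
""").fetchall()

# Every Hourly_Emps entity must also be an Employees entity.
orphan_subclass_row = accepted("INSERT INTO Hourly_Emps VALUES ('0000-000', 12.0, 10)")
print(hourly, orphan_subclass_row)
```

The foreign key from each subclass table to Employees is what expresses "A ISA B": a subclass row cannot exist without the matching superclass row.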
Aggregation is used when we have to model a relationship involving
entity sets and a relationship set.
◦ Aggregation allows us to treat a relationship set as an entity set
  for purposes of participation in (other) relationships.
[Figure: ER diagram with aggregation. Sponsors (since) relates Projects
(pid, started_on, pbudget) and Departments (did, dname, budget);
Monitors (until) relates Employees (ssn, name, lot) to the aggregated
Sponsors relationship set.]
* Each project is sponsored by one or more departments, and each department
sponsors at least one project (total participation).
* A department that sponsors a project might assign employees to monitor the
sponsorship. Thus Monitors should be a relationship set that associates a
Sponsors relationship with an Employees entity. Modeling this requires
aggregation.
* Aggregation vs. ternary relationship:
◦ Monitors is a distinct relationship, with a descriptive attribute.
◦ Also, we can say that each sponsorship is monitored by at most one employee.
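
In relational terms, aggregation shows up as a foreign key that points at a relationship table rather than at an entity table. The sketch below is one way to express this in SQLite via Python's sqlite3 module. It assumes that since belongs to Sponsors and until to Monitors, and it chooses (did, pid) as the key of Monitors to reflect "at most one employee per sponsorship"; column types, sample rows, and the helper `accepted` are also illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Employees (ssn CHAR(11) PRIMARY KEY, name CHAR(30), lot INTEGER);
CREATE TABLE Departments (did INTEGER PRIMARY KEY, dname CHAR(20), budget REAL);
CREATE TABLE Projects (pid INTEGER PRIMARY KEY, started_on DATE, pbudget REAL);
CREATE TABLE Sponsors (
    did INTEGER, pid INTEGER, since DATE,
    PRIMARY KEY (did, pid),
    FOREIGN KEY (did) REFERENCES Departments,
    FOREIGN KEY (pid) REFERENCES Projects);
-- Monitors refers to a whole Sponsors row, i.e. to the aggregated relationship.
-- Key (did, pid): each sponsorship is monitored by at most one employee.
CREATE TABLE Monitors (
    ssn CHAR(11), did INTEGER, pid INTEGER, until DATE,
    PRIMARY KEY (did, pid),
    FOREIGN KEY (ssn) REFERENCES Employees,
    FOREIGN KEY (did, pid) REFERENCES Sponsors (did, pid));
INSERT INTO Employees VALUES ('1223-133', 'Attishoo', 48), ('1122-989', 'Danial', 22);
INSERT INTO Departments VALUES (51, 'Toy', 100000);
INSERT INTO Projects VALUES (1, '2024-01-01', 20000), (2, '2024-06-01', 5000);
INSERT INTO Sponsors VALUES (51, 1, '2024-01-15');
INSERT INTO Monitors VALUES ('1223-133', 51, 1, '2025-12-31');
""")

def accepted(sql):
    """Hypothetical helper: True if the statement succeeds, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# Department 51 does not sponsor project 2, so there is nothing to monitor.
monitor_without_sponsorship = accepted(
    "INSERT INTO Monitors VALUES ('1223-133', 51, 2, '2025-12-31')")
# The sponsorship (51, 1) already has a monitor.
second_monitor = accepted(
    "INSERT INTO Monitors VALUES ('1122-989', 51, 1, '2026-06-30')")
print(monitor_without_sponsorship, second_monitor)
```

A plain ternary relationship among Employees, Departments, and Projects could not express either rejection as directly, because it has no separate Sponsors table to reference.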
Design choices:
Developing an ER diagram presents several choices, including:
◦ Should a concept be modeled as an entity or an attribute?
◦ Should a concept be modeled as an entity or a relationship?
◦ Identifying relationships: binary or ternary? Aggregation?


Should address be an attribute of Employees or an entity
(connected to Employees by a relationship)?
Depends upon the use we want to make of address
information, and the semantics of the data:
◦ If we have several addresses per employee, address
  must be an entity (since attributes cannot be set-valued).
◦ If the structure (city, street, etc.) is important, e.g., we
  want to retrieve employees in a given city, address
  must be modeled as an entity (since attribute values
  are atomic).

Works_In4 does not allow an employee to work in a department
for two or more periods.
Similar to the problem of wanting to record several addresses for an
employee: we want to record several values of the descriptive
attributes for each instance of this relationship. This is
accomplished by introducing a new entity set, Duration.
[Figure: Works_In4 between Employees (ssn, name, lot) and Departments
(did, dname, budget), with descriptive attributes from and to.]
[Figure: Works_In4 as a ternary relationship set among Employees,
Departments, and the new entity set Duration (from, to).]
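
The effect of moving from and to out of the relationship and into a Duration entity set can be seen in the keys of the resulting tables. The sketch below compares the two designs in SQLite via Python's sqlite3 module. The table name Works_In_Attr, the column names from_date and to_date (from and to are SQL keywords), the sample periods, and the helper `count_accepted` are assumptions; foreign keys are left out to keep the comparison short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- First design: from/to are descriptive attributes, so (ssn, did) is the key.
CREATE TABLE Works_In_Attr (
    ssn CHAR(11), did INTEGER, from_date DATE, to_date DATE,
    PRIMARY KEY (ssn, did));
-- Second design: Duration takes part in the relationship, so its key joins the table key.
CREATE TABLE Works_In4 (
    ssn CHAR(11), did INTEGER, from_date DATE, to_date DATE,
    PRIMARY KEY (ssn, did, from_date, to_date));
""")

# The same employee works in the same department during two separate periods.
periods = [
    ("1223-133", 51, "1991-01-01", "1992-12-31"),
    ("1223-133", 51, "1995-01-01", "1996-12-31"),
]

def count_accepted(table):
    """Hypothetical helper: insert both periods and count how many the table accepts."""
    stored = 0
    for period in periods:
        try:
            with conn:
                conn.execute(f"INSERT INTO {table} VALUES (?, ?, ?, ?)", period)
            stored += 1
        except sqlite3.IntegrityError:
            pass  # rejected by the primary key
    return stored

attribute_design = count_accepted("Works_In_Attr")
duration_design = count_accepted("Works_In4")
print(attribute_design, duration_design)
```

Under these assumptions the first table keeps only one of the two periods, while the second keeps both.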



Suppose that each department manager is given a discretionary
budget (dbudget).
The first ER diagram is OK if a manager gets a separate
discretionary budget for each dept.
What if a manager gets a discretionary budget that covers
all managed depts?
◦ Redundancy: dbudget is stored for each dept managed by the
  manager.
◦ Misleading: suggests that dbudget is associated with the
  department-mgr combination.
[Figure: first design. Manages2 between Employees (ssn, name, lot) and
Departments (did, dname, budget), with attributes since and dbudget.]
[Figure: second design. Managers is a subclass of Employees (ISA) with
attribute dbudget; Manages2 has attribute since; Departments has did,
dname, budget.]
The second design fixes the problem!
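
The redundancy argument can be made concrete by counting how many times one manager's dbudget is stored under each design. The sketch below does this in SQLite via Python's sqlite3 module. The table names Manages2_v1 and Manages2_v2, the assumption that Manages2 references Managers in the second design, the column types, and the sample rows are all illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- First design: dbudget is an attribute of Manages2, stored once per managed department.
CREATE TABLE Manages2_v1 (
    did INTEGER PRIMARY KEY, ssn CHAR(11), since DATE, dbudget REAL);
-- Second design: dbudget belongs to the Managers subclass, stored once per manager.
CREATE TABLE Managers (ssn CHAR(11) PRIMARY KEY, dbudget REAL);
CREATE TABLE Manages2_v2 (
    did INTEGER PRIMARY KEY, ssn CHAR(11), since DATE,
    FOREIGN KEY (ssn) REFERENCES Managers);

-- One manager with a single discretionary budget covering two departments.
INSERT INTO Manages2_v1 VALUES (51, '1223-133', '1991-01-01', 5000);
INSERT INTO Manages2_v1 VALUES (56, '1223-133', '1992-01-01', 5000);
INSERT INTO Managers VALUES ('1223-133', 5000);
INSERT INTO Manages2_v2 VALUES (51, '1223-133', '1991-01-01');
INSERT INTO Manages2_v2 VALUES (56, '1223-133', '1992-01-01');
""")

# Number of places in which this manager's dbudget value is stored.
copies_v1 = conn.execute(
    "SELECT COUNT(dbudget) FROM Manages2_v1 WHERE ssn = '1223-133'").fetchone()[0]
copies_v2 = conn.execute(
    "SELECT COUNT(dbudget) FROM Managers WHERE ssn = '1223-133'").fetchone()[0]
print(copies_v1, copies_v2)
```

With two managed departments, the first design stores the budget twice and the second stores it once, so an update in the second design cannot leave the copies out of step.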

Conceptual design follows requirements analysis.
◦ Yields a high-level description of data to be stored.
ER model popular for conceptual design.
◦ Constructs are expressive, close to the way people think
  about their applications.
Basic constructs: entities, relationships, and
attributes (of entities and relationships).
Some additional constructs: weak entities, ISA
hierarchies, and aggregation.
Note: There are many variations on ER model.

Several kinds of integrity constraints can be
expressed in the ER model: key constraints,
participation constraints, and overlap/covering
constraints for ISA hierarchies. Some foreign key
constraints are also implicit in the definition of a
relationship set.
◦ Some constraints (notably, functional dependencies)
  cannot be expressed in the ER model.
◦ Constraints play an important role in determining the
  best database design for an enterprise.

ER design is subjective. There are often many ways
to model a given scenario! Analyzing alternatives
can be tricky, especially for a large enterprise.
Common choices include:
◦ Entity vs. attribute, entity vs. relationship, binary or n-ary
  relationship, whether or not to use ISA hierarchies, and
  whether or not to use aggregation.
Ensuring good database design: resulting relational
schema should be analyzed and refined further. FD
information and normalization techniques are
especially useful.

Explain the following terms briefly and give an
example: attribute, domain, entity, relationship, one-to-many
relationship, many-to-many relationship, participation constraint, overlap
constraint, covering constraint, weak entity set, aggregation.


A university database contains information about
professors (identified by SSN) and courses (identified
by courseid). Professors teach courses; the following
situation concerns the Teaches relationship set.
Draw an ER diagram that describes it (assuming no
further constraints hold).
1. Professors can teach the same course in several semesters, and each offering
must be recorded.