CS 579 Database Systems

Theory, Practice & Methodology
of Relational Database
Design and Programming
Copyright © Ellis Cohen 2002-2008
Introduction to
Conceptual
Database Design
These slides are licensed under a Creative Commons
Attribution-NonCommercial-ShareAlike 2.5 License.
For more information on how you may use them,
please see http://www.openlineconsult.com/db
Overview of Lecture
Entity Classes
Relationships & ER Diagrams
1:M Relationship Design
Multiple Relationships &
The Fan Traps
Conceptual Design
Other ER Models
Mandatory Participation
Reflexive 1:M Relationships
Class Identification & Surrogate Keys
Redundancy & Anomalies
Simple Functional Dependencies
Simple Conceptual Normalization
© Ellis Cohen 2001-2008
Entity
Classes
Conceptual Modeling
Conceptual Modeling is a way of
designing systems involving
collections of tables by focusing on
• Entities
– an abstraction of tuples
• Entity Classes
– an abstraction of tables
• Relationships
– between entities in different entity
classes
Using diagrams called ER diagrams
(or Entity-Relationship Diagrams)
Tables as Themes
Employees
empno  ename   sal   comm
7499   ALLEN   1600  300
7654   MARTIN  1250  1400
7698   BLAKE   2850
7839   KING    5000
7844   TURNER  1500  0
7986   STERN   1500
A row represents
a single Employee
Every table has a theme
-- e.g. Employees
Every row represents an instance of that
theme -- e.g. a single Employee
Columns as Attributes
Employees
Primary key: empno (underlined on the slide); it uniquely identifies an employee
empno  ename   sal   comm
7499   ALLEN   1600  300
7654   MARTIN  1250  1400
7698   BLAKE   2850
7839   KING    5000
7844   TURNER  1500  0
7986   STERN   1500
Every column represents an attribute
related to the theme -- e.g. the name
or salary of an Employee
Rows as Objects/Entities
Employees
empno  ename   sal   comm
7499   ALLEN   1600  300
7654   MARTIN  1250  1400
7698   BLAKE   2850
7839   KING    5000
7844   TURNER  1500  0
7986   STERN   1500
an Employee Object
empno: 7654
ename: MARTIN
sal: 1250
comm: 1400
It can be useful to think of each row as
an object or entity and the table as a
collection of these entities.
The columns of the table correspond to
the instance variables for each object
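The row-as-object view can be sketched in Python. This is only an illustration: the class layout and the choice of None for a missing commission are assumptions, not part of the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Employee:
    """One row of the Employees table, viewed as an object."""
    empno: int                   # primary key
    ename: str
    sal: int
    comm: Optional[int] = None   # assumption: a blank commission is modeled as None


# The table is then a collection of these objects.
employees = [
    Employee(7499, "ALLEN", 1600, 300),
    Employee(7654, "MARTIN", 1250, 1400),
    Employee(7698, "BLAKE", 2850),
    Employee(7839, "KING", 5000),
    Employee(7844, "TURNER", 1500, 0),
    Employee(7986, "STERN", 1500),
]

# Look up a single entity by its primary key.
martin = next(e for e in employees if e.empno == 7654)
```

Each column becomes an instance variable, and the primary key is what lets us pick out exactly one object.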
Entity Classes
In conceptual modeling,
we focus on the entity class
which represents the class of
entities with the same theme.
In general (but as we will see,
not always), an entity class is
implemented by a table in a
relational database
Modeling Entity Classes
Visual Conceptual Model (Crow Magnum)
Sometimes we don't show any attributes
Employee
Sometimes we include all the attributes
Employee
empno
ename
sal
comm
Sometimes we just include the primary key
Employee
empno
Textual Conceptual Model (Brief ConText)
Employee( empno, ename, sal, comm )
Attribute Types
Keep attribute types simple
Complex attribute types often mean you need
to rethink your design or be more specific
Employee
empno: a number
ename: a string
sal: a dollar amount
comm: a dollar amount
Relationships
and ER Diagrams
ER (Entity-Relationship) Diagrams
(Crow Magnum style)
Depicts a relationship
between Employees
and Depts
Employee
empno
ename
sal
comm
relationship
characterization
works for
Dept
deptno
dname
Crow's Foot
The Crow's foot at Employee means …
• A Dept can have MANY Employees
No Crow's foot at Dept, so …
• An Employee works for no more than ONE Dept
ER & Instance Diagrams
ER Diagram
works for
Employee
Dept
Relationship
Entity Class
Entity Class
*
Corresponds to links between instances of the related classes
Instance Diagram
Shows example instances and the links between them
7499  ALLEN   1600  300
7654  MARTIN  1250  1400
7844  TURNER  1500  0
7698  BLAKE   2850
7986  STERN   1500
Entity Instances
Links
10
SALES
30
ACCOUNTING
Entity Instances
Instance Diagrams & Navigation
Links
7499  ALLEN   1600  300
7654  MARTIN  1250  1400
7844  TURNER  1500  0
7698  BLAKE   2850
7986  STERN   1500
10
SALES
30
ACCOUNTING
Links are the basis of navigation between instances.
We will see later that there are SQL queries which
effectively find the tuples in one entity class which are
related to the tuples in another entity class. So we can
write SQL to find information about
• The dept associated with an employee
• All the employees who work for a department
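As a rough preview of that navigation, here is a minimal sketch using Python's sqlite3 module. It assumes one possible relational mapping in which each employee row stores the deptno of its department; the table names Emps and Depts are hypothetical, and the sample rows are taken from the redundancy example later in the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Depts (deptno INTEGER PRIMARY KEY, dname TEXT);
-- Assumption: the works-for link is stored as a deptno column in Emps.
CREATE TABLE Emps  (empno INTEGER PRIMARY KEY, ename TEXT,
                    deptno INTEGER REFERENCES Depts(deptno));
INSERT INTO Depts VALUES (10, 'ACCOUNTING'), (30, 'SALES');
INSERT INTO Emps  VALUES (7654, 'MARTIN', 30), (7698, 'BLAKE', 30),
                         (7839, 'KING', 10), (7844, 'TURNER', 30);
""")


def dept_of(empno):
    """Navigate from an employee to the dept it works for."""
    row = conn.execute(
        "SELECT d.dname FROM Emps e JOIN Depts d ON e.deptno = d.deptno "
        "WHERE e.empno = ?", (empno,)).fetchone()
    return row[0] if row else None


def employees_of(deptno):
    """Navigate from a dept to all the employees who work for it."""
    rows = conn.execute(
        "SELECT ename FROM Emps WHERE deptno = ? ORDER BY empno",
        (deptno,)).fetchall()
    return [r[0] for r in rows]
```

Calling dept_of(7839) follows a link in one direction, and employees_of(30) follows the same links in the other direction.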
Entities & Links as Persistent Data
Every
– entity instance of an entity class
– link instance of a relationship
Represents
– data persistently stored in a database
– information needed to answer a query
(e.g. what’s the salary of ALLEN?, what
department does ALLEN work in?)
The ONLY reason to represent an entity class or a
relationship in a conceptual model is because the
requirements clearly indicate they are necessary to
provide information needed for some query
1:M (1 to Many) Relationships
Child Entity Class
Employee
Parent Entity Class
works for
Dept
deptno
empno
an Employee works for (at most) 1 Dept
a Dept has any number (i.e. M) of Employees
Easy Crow Magnum Relationships
Visual Conceptual Model (Easy Crow Magnum)
Employee
Employee
empno
works for
works for
Dept
Dept
deptno
no attributes
shown in this
example
just primary
keys shown
in this
example
Easy Crow Magnum is meant for
quickly drawing designs on paper or a whiteboard
In Easy Crow Magnum,
don't draw the box outlines on entity classes
Visual & Textual Models for Relationships
VISUAL Conceptual Model (Crow Magnum)
Employee
works for
Dept
relationship characterization
TEXTUAL Conceptual Model (Brief ConText)
(*) Employee works for Dept
relationship characterization
Inverse Relationships
VISUAL Conceptual Model (Crow Magnum)
Dept
contains
Employee
relationship characterization
TEXTUAL Conceptual Model (Brief ConText)
Dept contains (*) Employee
relationship characterization
Indicating Relationship Direction
Employee
Employee
Employee
works
for
works
for
contains
Dept
Dept
Dept
Dept
Dept
Dept
contains
contains
works
for
Employee
Employee
Employee
Must indicate the relationship direction if it is not the
natural reading direction (L-to-R, top-to-bottom)
Naming Relationships
It is useful, though not required,
to name relationships in ConText
Entity Classes
Employee( empno, ename, addr )
Dept( deptno, dname )
Relationship
name
Relationships
Relationship
characterization
DeptAssignment: (*) Employee works for Dept
Choose a relationship name so that it
clearly and uniquely identifies the relationship
M:N (Many-to-Many) Relationships
Visual Conceptual Model (Crow Magnum)
Employee
assigned to
Project
Each employee may be assigned to
a number of projects
Each project may have
a number of employees
Textual Conceptual Model (Brief ConText)
(*) Employee assigned to (*) Project
M:N Related Instances
assigned to
Employee
Project
empno
empno ename
pno
address
7499
ALLEN
...
7654
MARTIN
...
7698
BLAKE
...
7839
KING
...
7844
TURNER
...
7986
STERN
...
pno
pname …
2618
…
2621
…
2622
…
1:1 (One-to-One) Relationships
Visual Conceptual Model (Crow Magnum)
Desk
assigned to
Employee
Each employee may be assigned to
at most one desk
Each desk may be assigned to
at most one employee
Textual Conceptual Model (Brief ConText)
Desk assigned to Employee
1:1 Related Instances
assigned to
Employee
Desk
empno
empno ename
deskno
address
7499
ALLEN
...
7654
MARTIN
...
7698
BLAKE
...
7839
KING
...
7844
TURNER
...
7986
STERN
...
deskno …
311
…
312
…
313
…
1:M Relationship
Design
1:M Relationship Exercise
Come up with an
Easy Crow Magnum ER Diagram of a 1:M
Relationship between two entity classes
(not a 1:1 or M:N relationship, and not
Dept/Employee)
Employee
works for
empno
ename
sal
Dept
deptno
dname
The diagram must
1. show the name of each entity class
2. show main attributes of each entity class
3. include the primary key (and underline it)
4. show the relationship characterization
Choosing Relationship
Characterizations
Employee
Employee
has
works
for
Dept
Dept
Dept
Dept
has
employs
Employee
Employee
ER diagrams are meant for communicating designs
as clearly as possible.
It is worth taking the time to choose the best
possible relationship characterization.
Mistaking 1:M for M:N
Person
likes
Ice Cream
Flavor
persid
flavid
It is common to consider relationships from only one perspective.
Think about a person. They like many ice cream flavors. Must be 1:M?
Person
likes
Ice Cream
Flavor
persid
flavid
But in fact, many persons like the same ice cream flavor, so it is
actually M:N. Specifying the relationship characterization is crucial.
The "favorite flavor" relationship is 1:M, but in the other direction!
Person
favorite flavor
Ice Cream
Flavor
persid
flavid
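One way to avoid looking from only one perspective is to write down a few link instances and count in both directions. The sketch below does that; the function name classify and the sample people and flavors are made up for illustration. Sample links can only show that a relationship is at least 1:M or M:N; the real cardinality still comes from the requirements.

```python
from collections import Counter


def classify(links):
    """Classify (left, right) link instances as '1:1', '1:M', 'M:1' or 'M:N'.

    '1:M' means one left entity is linked to many right entities.
    """
    links = set(links)
    per_left = Counter(left for left, _ in links)     # rights linked to each left
    per_right = Counter(right for _, right in links)  # lefts linked to each right
    left_has_many = any(n > 1 for n in per_left.values())
    right_has_many = any(n > 1 for n in per_right.values())
    if left_has_many and right_has_many:
        return "M:N"
    if left_has_many:
        return "1:M"
    if right_has_many:
        return "M:1"
    return "1:1"


# Looking only at one person: it seems to be 1:M.
one_person = {("ann", "vanilla"), ("ann", "mint")}
# Adding a second person who likes the same flavor reveals M:N.
likes = one_person | {("bob", "vanilla")}
# Each person has one favorite, but a flavor is the favorite of many persons.
favorite = {("ann", "vanilla"), ("bob", "vanilla")}
```

Here classify(one_person) gives "1:M", classify(likes) gives "M:N", and classify(favorite) gives "M:1", i.e. 1:M in the other direction.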
Choosing the Wrong Key
Person
owns
Book
persid
title
This case is somewhat similar to the previous one. A person owns
many books, each one identified by its title. So, it must be 1:M.
In fact, many people own a book with a particular title (e.g. there are lots
of copies of "Gone with the Wind"). The problem is that the wrong
primary key has been used.
Imagine that every copy of every book ever published was given a
unique serial number which uniquely identifies a single book instance.
That's what would be needed to really have a 1:M relationship!
Person
owns
persid
Book
serialnum
Singleton Classes
DON'T DO THIS
Boston University
owns
Building
bldgno
DO THIS INSTEAD
Building
bldgno
OR THIS
University
owns
name
Building
bldgno
Boston University is a singleton entity class, which
only has a single entity (i.e. a single tuple) in it.
Either leave it out entirely, or
replace it with a more general entity class
Singleton Classes & M:N Relationships
DON'T DO THIS
Microsoft
uses
Building
bldgno
DO THIS INSTEAD
Building
bldgno
OR THIS
Company
uses
Building
name
bldgno
Generalizing singleton entity classes
can result in M:N relationships
Entity Classes vs. Attributes
DON'T DO THIS
Employee
empno
ename
makes
Salary
salary
DO THIS INSTEAD
Employee
empno
ename
salary
Don't create an entity class for something that really
does not need to have a "life of its own", but ought
to simply be an attribute of another entity class
Attribute vs. Entity Class Principles
Employee
empno
ename
deptno
Employee
empno
ename
Dept
deptno
Reasons for upgrading attributes to entity classes:
1. Substance: It emphasizes that a department is
something that can and should stand in its own right
2. Extensibility: One might want to add attributes specific
to a department (e.g. its name, location, etc.)
3. Multiplicity: It emphasizes that a department can have
multiple employees associated with it
4. Association: It emphasizes that an employee cannot
have an arbitrary deptno value, but that the employee
is associated with a department which has a specific
deptno
Note that none of these reasons make sense for Salary
(though multiplicity suggests that it might be useful to add a PayGrade
Class, and make salary an attribute of a PayGrade)
Break Up Complex Attributes
Employee
empno
ename
assignments
Do NOT use attributes that
contain complex sets of
details.
Break them up into
additional entity classes!
Employee
empno
ename
Assignment
assnid
projnam
rate
startdate
Moreover, this should probably be replaced
by a relationship with a Project entity class
Entity Attributes & Relationships
works for
Employee
Does NOT
include deptno
empno
ename
addr
deptno
*
Dept
deptno
dname
Employee does not contain an attribute that identifies
the Dept it is associated with (i.e. deptno).
An employee is certainly associated with a
department – but that is represented by the
relationship between Employee and Dept.
A deptno attribute in Employee would be an
entity attribute (it identifies a Dept entity).
This would not only be redundant (with the
works for relationship), but wrong to include
in a Conceptual Model [no conceptual foreign keys]
Relationships & Persistence
• A relationship represents
information which needs to be
persistently stored in a database!
• If the information doesn't need to
be stored and queried, don't
represent it as a relationship
*
• Don't include relationships which
simply show what a user can do or
keep track of what a user has done,
unless it is clear that information
will be needed later!
Multiple
Relationships
& The Fan Traps
Multiple Relationships
Team
Team
Coach
has
Player
has
Player
works
for
Team
has
enrolled
in
has
Child
Health Plan
Player
What do these diagrams mean?
The Fan Trap
Suppose there are multiple divisions in a company,
each divided into departments
Every employee works for a division (and is
assigned to a particular department in that division)
What's wrong with the diagram below?
Employee
works for
Division
divided into
Dept
Fan Trap Instances
7499 ALLEN
7654 MARTIN
7844 TURNER
7698 BLAKE
7986 STERN
…
10
SALES
30
ACCOUNTING
DIV A
DIV B
…
It is impossible to determine which department an
employee is assigned to!
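The ambiguity can be made concrete with a small sketch of the links this model stores. The particular assignments below (which employee works for which division, and which depts each division contains) are assumed purely for illustration.

```python
# Links available in the fan-trap model (all sample values are assumptions):
works_for = {7499: "DIV A", 7986: "DIV B"}          # Employee -> Division
divided_into = {"DIV A": [10, 20], "DIV B": [30]}   # Division -> its Depts


def candidate_depts(empno):
    """The best we can do: list every dept in the employee's division."""
    return divided_into[works_for[empno]]


# Employee 7499 could be in either dept of DIV A; the model cannot say which.
ambiguous = candidate_depts(7499)
```

As soon as a division has more than one dept, navigation from an employee fans out to several depts and the employee's own dept is lost.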
The Reverse Fan Trap
Suppose a company has multiple divisions
Every employee is employed by a division, and
assigned to a particular department (in that division)
What's wrong with this diagram?
Division
divno
employs
Employee
empno
works for
Dept
deptno
Reverse Fan Trap Instances
DIV A
DIV B
7499
ALLEN
7654
MARTIN
7844
TURNER
7698
BLAKE
7986
STERN
…
10
SALES
30
ACCOUNTING
…
Two employees in the same department could be
assigned to different divisions
Is there any way to prevent this when using this model?
Business Rules &
The Reverse Fan Trap
works
for
employs
Division
divno
Employee
Dept
empno
deptno
+ Two employees who are in the same department
must be in the same division
We can prevent this problem by adding the
business rule above!
But how would this be enforced?
Well, we'll see that we can write code that detects
every time a change is made to the data in the
database which might violate this business rule, and
then ensures that the business rule is enforced!
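As a hedged illustration of what such a check might look like (not the mechanism the course later introduces), the function below scans the stored links and reports any dept whose employees are spread over more than one division. The function name and the sample assignments are assumptions.

```python
def rule_violations(emp_dept, emp_div):
    """Return {deptno: divisions} for depts whose employees span several divisions.

    emp_dept maps empno -> deptno (works for);
    emp_div maps empno -> division (employs, read from the employee side).
    """
    divisions_by_dept = {}
    for empno, deptno in emp_dept.items():
        divisions_by_dept.setdefault(deptno, set()).add(emp_div[empno])
    return {d: divs for d, divs in divisions_by_dept.items() if len(divs) > 1}


# Assumed sample data that satisfies the business rule.
emp_dept = {7499: 10, 7654: 10, 7986: 30}
emp_div = {7499: "DIV A", 7654: "DIV A", 7986: "DIV B"}
before = rule_violations(emp_dept, emp_div)

# A change that breaks the rule: MARTIN is moved to DIV B but stays in dept 10.
emp_div[7654] = "DIV B"
after = rule_violations(emp_dept, emp_div)
```

An enforcement mechanism would run a check like this whenever a link changes and reject (or repair) any change that leaves a violation.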
Are there any other problems with this model?
Deletion Anomalies &
The Reverse Fan Trap
DIV A
DIV B
7499
ALLEN
7654
MARTIN
7844
TURNER
7698
BLAKE
7986
STERN
…
10
SALES
30
ACCOUNTING
…
Suppose STERN is the only employee in department 30.
If STERN is terminated, there is no longer any way to
determine that dept 30 is in division B!
So maybe we should try another model?
Resolving the Fan Traps
Employee
7499
ALLEN
7654
MARTIN
7844
TURNER
7698
BLAKE
7986
STERN
works
for
Dept
10
part
of
Division
SALES
DIV A
30
ACCOUNTING
…
DIV B
…
It is now possible to determine each employee's
department & each department's division!
We can also still tell which division an employee is
assigned to, by following the link from the employee to the
dept, and then from the dept to the division
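A minimal sketch of this two-step navigation follows. STERN being in dept 30 and dept 30 being in division B come from the earlier slide; the other assignments are assumed for illustration.

```python
# Links in the resolved model.
works_for = {7499: 10, 7986: 30}        # Employee -> Dept (7499's dept is assumed)
part_of = {10: "DIV A", 30: "DIV B"}    # Dept -> Division (dept 10's division is assumed)


def division_of(empno):
    """Follow employee -> dept, then dept -> division."""
    return part_of[works_for[empno]]


stern_division = division_of(7986)

# Terminating STERN removes only the employee's own link ...
del works_for[7986]
# ... while the fact that dept 30 is part of DIV B is still stored.
dept_30_division = part_of[30]
```

Because the dept-to-division link is stored on its own, the deletion anomaly from the reverse fan trap no longer occurs.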
Conceptual
Design
Database Design Levels
Conceptual Design / Model
Model of the database design in terms
that users will understand
Logical Design / Model
Description of the design in terms that
can be directly used to build a database
(This is called a Relational Model,
if we are building an RDB)
Physical Design / Model
Additional design descriptions that
specify or affect the data representation
in physical storage
Database Design Process
Requirements
Conceptual Design
Conceptual Model
A model used for
communication with system
analysts and UI designers
Relational Mapping
Actual Design of Tables in
the Database
Relational Model
Physical Mapping
Physical Model using DDL & DCL
Conceptual Modeling
Conceptual Modeling
(also known as Conceptual Design)
starts with two activities
• Identify Entity Classes
• Identify Relationships
between them
Identify Entity Classes
Identify the classes of entities called
for to support requirements
Think of an entity as
– a readily identifiable thing
A potential entity class should have
1. Multiple instances (i.e. the
requirements should imply the need
for more than one entity of that class)
2. A clear set of attributes based on the
requirements
3. A primary key to uniquely identify
tuples
Design Exercise
A university is divided into
departments.
Each department is made up of
faculty members.
Each department may have a number
of degree-granting programs.
Every student may only declare a
major in a single program.
Every student may be assigned a
faculty member as an advisor.
What are the entity classes needed?
Entity Classes Needed
University: MAYBE, but it sounds like it's a
singleton class
Department: YES
Faculty Member: YES
Student: YES
Person: MAYBE (ignore for now)
(Degree-Granting) Program: YES
Major: MAYBE (but it is simpler to just treat
major as a relationship between a
student and a program)
Advisor: MAYBE (but it is simpler to treat
advise as a relationship between Faculty
Member and Student; also, Advisor is a
subclass of Faculty Member, which is a bit too
complicated to get into just yet)
Identify Relationships
If the database needs to
persistently keep track of a
link/association between an
entity of one class and an
entity of another class,
then the entity classes are related to
one another
Draw the Easy Crow Magnum diagram based on
the identified entity classes. All 1:M Relationships!
Design Solution
offers
Department
made up of
Faculty
Member
Program
majors in
advises
Student
Department
Faculty
Member
Faculty members sometimes have appointments in
more than one department. The relationship would
be 1:M if the requirement were "Each department is
made up of faculty members, and each faculty
member is in at most one department"
Relationship Exercise
Bikeshop
has in stock
Bike
sells
instance of
Bike Model
part of
Part
used on
instance of
Part Model
The model above describes the parts on bikes sold by a
chain of bike-shops (i.e. there are multiple bike-shops).
It contains both 1:M and M:N relationships,
but the Crow's Feet have all been removed.
Put them back where they belong!
Relationship Exercise Answer
Bikeshop
has in stock
Bike
sells
instance of
used on
part of
Part
Bike Model
instance of
Part Model
Uses of 1:M Relationships
1. Relationship between independent entity classes
Employee
enrolled
in
Dental Plan
2. Aggregation (Part/Whole) Relationship
Part
part of
Bike
3. Instance/Category Relationship
Part
instance of
Part Model
Entity Class Specificity
Whenever we design an entity class, it is
important to think VERY CAREFULLY
about its specificity
When designing the entity class Auto, we
could mean
– A kind of auto: Individual entities would be
identified by model and year: e.g. 1984 Honda
Accord. Better to name this class Auto Model
– A specific auto: Individual entities would be
identified by their VIN (vehicle identification
number): e.g. 614HT37994PL7394, but might
also have a model and year (or maybe not)
Sometimes a design will need multiple
levels of specificity (related by a 1:M
relationship – e.g. Bike & Bike Model).
Other
ER Models
Chen 1:M ER Model
Child Entity Class
Crow
Magnum
Employee
empno
ename
addr
Chen
Employee
Parent Entity Class
works for
works
for
Dept
deptno
dname
Dept
Key constraint: The primary key of Employee not
only uniquely identifies an employee (and their
name & addr), but also uniquely identifies their
associated department (its deptno & dname)
UML 1:M ER Model
Child Entity Class
Crow
Magnum
Parent Entity Class
works for
Employee
Dept
Associations (i.e. relationships) reflect the
associations that one entity should or needs to have
with other entities.
A department needs to be associated with
multiple (*) employees – all the employees that work for it
UML
Employee
*
works for
0..1
Dept
An employee only needs to be associated with, at
most, a single department – the department it is in
UML with Attributes
Crow
Magnum
works for
Employee
Dept
empno
ename
addr
UML
Employee
PK
*
deptno
dname
works for
empno
ename
addr
0..1
Dept
PK
deptno
dname
Using UML's icon notation for PK
Primary Key Representations in UML
UML
Dept
«PK» deptno
dname
Use UML's stereotype notation,
which is a way of associating
domain-specific characteristics,
enclosed in «guillemets», with
a UML element
UML
Dept
PK deptno
dname
Use UML's icon notation.
An icon is a predefined
arbitrary graphical
symbol (in this case, PK)
to be used in place of a
stereotype
UML
Dept
deptno {PK}
dname
Use UML's property notation,
which is a way of
associating properties,
enclosed in {curly braces},
with a UML element
UML Aggregation
Crow
Magnum
part of
Part
Bike
When a 1:M relationship is a part/whole relationship
(e.g. a bicycle part is a part of a bike), then
UML uses a special aggregation symbol to depict it
UML
Part
*
0..1
Bike
In rare circumstances, UML also allows aggregation
with M:N relationships
1:M Relationships
Crow
Magnum
Easy
Crow
Magnum
UML
Chen
works for
Employee
works for
Employee
Employee
Employee
Dept
*
works for
works
for
Dept
0..1
Dept
Dept
1:1 Relationships
Crow
Magnum
Easy
Crow
Magnum
UML
Chen
sits at
Employee
sits at
Employee
Employee
Employee
Desk
0..1
sits at
sits at
Desk
0..1
Desk
Desk
M:N Relationships
Crow
Magnum
Easy
Crow
Magnum
UML
Chen
assigned to
Employee
Project
assigned to
Employee
Employee
*
assigned to
assigned
to
Employee
Project
*
Project
Project
No key constraints!
Mandatory
Participation
Child Participation
Child Entity Class
Employee
Parent Entity Class
works for
Dept
Mandatory: An Employee MUST work for 1 Dept
Employee
works for
Dept
Indeterminate
Deferred participation design decision
Mandatory Child Participation
(Every employee is assigned to 1 department)
Child Entity Class
Employee
Parent Entity Class
works for
7499  ALLEN   1600  300
7654  MARTIN  1250  1400
7844  TURNER  1500  0
7698  BLAKE   2850
7986  STERN   1500
Dept
10
SALES
30
ACCOUNTING
Every Employee participates
in a relationship with a Dept
Non-Mandatory Child Participation
(There may be employees with no dept)
Child Entity Class
Employee
Parent Entity Class
works for
7499  ALLEN   1600  300
7654  MARTIN  1250  1400
7844  TURNER  1500  0
7698  BLAKE   2850
7986  STERN   1500
Dept
10
SALES
30
ACCOUNTING
There are Employees who don't participate
in a relationship with a Dept
Mandatory Child Participation
in UML
Child Entity Class
UML
Employee
Parent Entity Class
*
works for
1
Dept
an Employee must be assigned to 1 Dept
Crow
Magnum
Employee
works for
Dept
Parent Participation
Child Entity Class
Parent Entity Class
works for
Employee
Dept
Mandatory: MUST be 1 Employee in every Dept
Employee
works for
Dept
Indeterminate:
Deferred participation design decision
Mandatory Parent Participation
(Every department has at least 1 employee)
Child Entity Class
Employee
Parent Entity Class
works for
7499  ALLEN   1600  300
7654  MARTIN  1250  1400
7844  TURNER  1500  0
7698  BLAKE   2850
7986  STERN   1500
Dept
10
SALES
30
ACCOUNTING
Every Dept participates
in a relationship with an Employee
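Both kinds of mandatory participation can be checked against a set of link instances. The sketch below is only illustrative: the function names are made up, and the sample links (including an unassigned employee 7499 and an empty dept 20) are assumptions.

```python
def employees_without_dept(employees, works_for):
    """Child participation is satisfied only if this list is empty."""
    return [e for e in employees if e not in works_for]


def depts_without_employees(depts, works_for):
    """Parent participation is satisfied only if this list is empty."""
    staffed = set(works_for.values())
    return [d for d in depts if d not in staffed]


# Assumed sample instances: works_for maps empno -> deptno.
employees = [7499, 7654, 7698, 7839, 7844]
depts = [10, 20, 30]
works_for = {7654: 30, 7698: 30, 7839: 10, 7844: 30}

unassigned = employees_without_dept(employees, works_for)
empty_depts = depts_without_employees(depts, works_for)
```

With these links, employee 7499 violates mandatory child participation and dept 20 violates mandatory parent participation.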
Non-Mandatory Parent Participation
(There may be depts with no employees)
Child Entity Class
Parent Entity Class
works for
Employee
7499  ALLEN   1600  300
7654  MARTIN  1250  1400
7844  TURNER  1500  0
7698  BLAKE   2850
7986  STERN   1500
Dept
10
SALES
20
RESEARCH
30
ACCOUNTING
There are Depts that don't participate
in a relationship with Employees
Mandatory Parent Participation
in UML
Child Entity Class
UML
Employee
Parent Entity Class
1..*
works for
0..1
Dept
a Dept must have at least 1 employee
Crow
Magnum
Employee
works for
Dept
Reflexive
1:M Relationships
Reflexive Entity Attributes
Suppose an employee can have a manager,
(who is another employee)
Employee
DON'T
DO THIS!
empno
ename
addr
mgr
This could identify
an employee's
manager.
Perhaps it would
hold the empno or
ename of the
manager.
IS THIS OK OR A BAD IDEA?
Entity Attributes & Relationships
Employee
DON'T DO
THIS!
works
for
empno
ename
addr
deptno
Dept
deptno
dname
What's wrong with adding deptno as
an attribute of Employee?
Attributes vs Relationships
Employee
DON'T
DO THIS!
empno
ename
addr
mgr
mgr would be an
entity attribute –
an attribute whose value
identifies some other entity –
in this case,
some other employee
But entity attributes
reflect relationships –
e.g. an employee is related
(by the manages relationship)
to the employee
who is their manager.
The conceptual model is meant to show relationships.
Replace entity attributes by relationships!
Reflexive Relationships
Visual Conceptual Model (Crow Magnum)
An employee
may manage
other employees
Employee
empno
ename
addr
manages
This does not imply that a manager can manage
themselves (which would probably be a bad idea!)
It doesn't disallow it either.
We'll see how to do that later.
Textual Conceptual Model (Brief ConText)
Employee( empno, ename, addr )
Employee manages (*) Employee
Reflexive Relationships
& Instance Hierarchies
7839  KING    5000
7698  BLAKE   2850
7566  JONES   2975
7654  MARTIN  1250  1400
7844  TURNER  1500  0
7499  ALLEN   1600  300
Reflexive relationships commonly describe entity hierarchies:
KING manages BLAKE & JONES
BLAKE manages MARTIN & TURNER
JONES manages ALLEN, etc.
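The hierarchy above can be walked by following the manages links repeatedly. This is a small sketch; the dictionary representation and the function name all_reports are illustrative choices, not part of the slides.

```python
# The manages links listed above: manager -> directly managed employees.
manages = {
    "KING": ["BLAKE", "JONES"],
    "BLAKE": ["MARTIN", "TURNER"],
    "JONES": ["ALLEN"],
}


def all_reports(manager):
    """Everyone below a manager in the hierarchy, found by following links."""
    result = []
    for employee in manages.get(manager, []):
        result.append(employee)
        result.extend(all_reports(employee))   # descend one level further
    return result
```

For example, all_reports("KING") visits BLAKE, then BLAKE's reports, then JONES and JONES's reports. Note that this simple recursion assumes the links form a true hierarchy; a cycle (such as a manager managing themselves) would make it loop forever.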
Complete Conceptual Model
with Attributes
Employee
works for
empno
ename
addr
Dept
deptno
dname
manages
Sometimes, designers draw a more detailed diagram that
includes the attributes of an entity class
(sometimes JUST the primary key attributes)
Visual/Textual Conceptual Model
Visual Conceptual Model (Crow Magnum)
Employee
empno
manages
works for
Dept
deptno
Note: here we chose to just show the primary keys
Textual Conceptual Model (Brief ConText)
Entity Classes
Employee( empno, ename, addr )
Dept( deptno, dname )
Note: MUST list all conceptual attributes here
Relationships
WorksFor: (*) Employee works for Dept
Manages: Employee manages (*) Employee
Class Identification
& Surrogate Keys
Entity Class Identification Exercise
A clothing manufacturer identifies each style of item
they make by their own unique stylecode. Items
within the same style vary by size and color, and
each such item is given an itemsku. Each style of
item has a category identified by its catid.
Create a conceptual model (UML or Easy Crow
Magnum ER Diagram) for the manufacturer's
database, which includes the following attributes
itemsku    e.g. 'FX311B-24M'
stylecode  e.g. '302'
stylenam   e.g. 'Hunting Bikini Brief'
styledate  e.g. 1992 (when introduced)
catid      e.g. 'MU'
catnam     e.g. 'Mens Underwear'
size       e.g. 9
color      e.g. 'red'
Answer to
Entity Class Identification Exercise
VISUAL Conceptual Model (Easy Crow Magnum)
Item
itemsku
size
color
has
Style
stylecode
stylenam
styledate
has
Category
catid
catnam
TEXTUAL Conceptual Model (Brief ConText Entity Classes)
Item( itemsku, size, color )
Style( stylecode, stylenam, styledate )
Category( catid, catnam )
ItemStyle: (*) Item has Style
StyleCategory: (*) Style has Category
Surrogate Primary Key
Item
itemsku
size
color
vs
Item
itemid
itemsku
size
color
Surrogate primary key:
A new attribute added to an entity class or relation
and used in place of the original primary key.
Why would you add a surrogate key?
Note: We often wait to add surrogate
keys until we map the conceptual
model to a relational model
Simple Candidate Keys
A simple candidate key is any attribute
which uniquely identifies a tuple
Item
itemid
itemsku
size
color
Simple
Candidate
Keys
Both itemid and itemsku
uniquely identify an item
A designer chooses a primary key from
one of the candidate keys
Redundancy
& Anomalies
Normalization Problem
In designing a conceptual model
for employees with the
attributes empno, ename,
deptno, dname, addr,
what's wrong with just using a
single entity class:
Employee( empno, ename,
deptno, dname, addr )?
Answer: What's Wrong …
Entity Class Principles: substance,
extensibility, multiplicity, and
association all argue that there should
be a Dept class
Redundancy: deptno & dname
Extra Work: If the name of a
department changed, it would have to
be changed in multiple places
Anomalies: Could change deptno
without changing dname or vice
versa.
Redundancy: deptno → dname
Employee
empno  deptno  dname
7654   30      SALES
7698   30      SALES
7839   10      ACCOUNTING
7844   30      SALES
7986   50      SUPPORT
…
• Entities with the same value for deptno have the same
value for dname
• Including dname in the entity class is redundant, since
it can be derived from deptno
Redundancy causes duplicate work
Suppose the company wants to change deptno 30 to be
the Sales & Marketing department. That change must
be made to multiple employees
Redundancy and Anomaly
Redundancy can cause anomalies
(inconsistencies) if modifications are not
done carefully
• Update Anomaly:
– Updating a value in a single cell can make the
database inconsistent
• Insertion Anomaly:
– Adding an entity can make the database
inconsistent
• Deletion Anomaly:
– Deleting some information can make the
database inconsistent or cause unintended loss
of information
Anomaly Examples
Employee
empno  deptno  dname
7654   30      SALES
7698   30      SALES
7839   10      ACCOUNTING
7844   30      SALES
7986   50      SUPPORT
…
Modification Anomaly:
Modify 7654's dname to 'SUPPORT'
(without changing its deptno)
Insert Anomaly:
Insert a new employee with a deptno of 20,
and a dname of 'SUPPORT'
Delete Anomaly:
Delete employee 7986 (it’s the only employee in
SUPPORT, and no other entity class keeps track
that dept 50 is SUPPORT)
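The three anomalies can be reproduced on the sample rows with a few lines of Python. This is a sketch for illustration only: the helper name group and the empno 8000 used for the inserted employee are assumptions.

```python
rows = [
    {"empno": 7654, "deptno": 30, "dname": "SALES"},
    {"empno": 7698, "deptno": 30, "dname": "SALES"},
    {"empno": 7839, "deptno": 10, "dname": "ACCOUNTING"},
    {"empno": 7844, "deptno": 30, "dname": "SALES"},
    {"empno": 7986, "deptno": 50, "dname": "SUPPORT"},
]


def group(rows, key, value):
    """Collect, for each value of `key`, the set of `value`s seen with it."""
    out = {}
    for r in rows:
        out.setdefault(r[key], set()).add(r[value])
    return out


# Modification anomaly: change 7654's dname without changing its deptno.
modified = [dict(r, dname="SUPPORT") if r["empno"] == 7654 else r for r in rows]
names_for_30 = group(modified, "deptno", "dname")[30]        # two names for dept 30

# Insert anomaly: a new employee (hypothetical empno 8000) in dept 20 'SUPPORT'.
inserted = rows + [{"empno": 8000, "deptno": 20, "dname": "SUPPORT"}]
depts_named_support = group(inserted, "dname", "deptno")["SUPPORT"]  # two deptnos

# Delete anomaly: removing 7986 also removes the only record of dept 50.
deleted = [r for r in rows if r["empno"] != 7986]
known_depts = set(group(deleted, "deptno", "dname"))
```

After each change the data either contradicts itself (one dept with two names, one name with two deptnos) or has silently lost the fact that dept 50 is SUPPORT.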
Simple Functional
Dependencies
Redundancy and
Functional Dependencies
Functional Dependencies
• Specify which attributes in an
entity class are determined by
other attributes
• Identify potential redundancies
• Help us see how to eliminate
those redundancies (generating
the conceptual model we really
should have produced initially!)
Functional Dependencies (FD's)
Dependencies among attributes
A → B
A functionally determines B
B functionally depends on A
The value of A uniquely determines
a single value for B
If two or more entities (of a specific
entity class) have the same value for
A, they have the same value for B
(e.g. every employee that has the same value for
deptno, e.g. 30, has the same value for
dname, e.g. SALES)
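The definition translates directly into a check over sample entities. The sketch below uses the sample rows from the redundancy example; the function name fd_holds is made up. Keep in mind that sample data can refute a functional dependency but can never prove one, since an FD is a statement about all possible data.

```python
def fd_holds(rows, a, b):
    """True if no two rows agree on attribute a but differ on attribute b."""
    seen = {}
    for r in rows:
        # Remember the first b value seen for each a value; any later
        # disagreement means a does not determine b in this sample.
        if seen.setdefault(r[a], r[b]) != r[b]:
            return False
    return True


rows = [
    {"empno": 7654, "deptno": 30, "dname": "SALES"},
    {"empno": 7698, "deptno": 30, "dname": "SALES"},
    {"empno": 7839, "deptno": 10, "dname": "ACCOUNTING"},
    {"empno": 7844, "deptno": 30, "dname": "SALES"},
    {"empno": 7986, "deptno": 50, "dname": "SUPPORT"},
]
```

On this sample, fd_holds(rows, "deptno", "dname") is True, while fd_holds(rows, "deptno", "empno") is False because dept 30 appears with several different employees.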
FD's for a Normalized Example
Employee( empno, ename, addr )
empno → ename
empno → addr
empno can be used to look up
(and therefore uniquely
determine) all the other
attributes of an Employee tuple
This can also be written as
empno → ename, addr
or
empno → { ename, addr }
Also
empno → empno
(this is a trivial FD, which we usually don't write)
Determinants & Dependents
empno → addr
empno is the Determinant
addr is the Dependent
FD's for an Example with Redundancy
Employee( empno, ename, deptno, dname, addr )
empno → ename
empno → deptno
empno → dname
empno → addr
However, also
deptno → dname
(possibly) dname → deptno
This is a problem!
deptno is NOT a candidate key
It indicates redundancy!
Redundancy: deptno → dname
Employee
empno | deptno | dname
      | 30     | SALES
7698  | 30     | SALES
7839  | 10     | ACCOUNTING
7844  | 30     | SALES
7986  | 50     | SUPPORT
7654  | …      | …
Because deptno is not a candidate key,
the same deptno value (e.g. 30) can appear multiple times.
But deptno → dname,
that is, two tuples with the same value of deptno have the
same value of dname
Voila! REDUNDANCY!
Simple FD Exercise
Assume you have (foolishly) designed a
single Item entity class containing all of the
following attributes
itemsku     e.g. 'FX311B-24M'
stylecode   e.g. '302'
stylenam    e.g. 'Hunting Bikini Brief'
styledate   e.g. 1992 (when introduced)
catid       e.g. 'MU'
catnam      e.g. 'Mens Underwear'
size        e.g. 9
color       e.g. 'red'
Find all the simple FD's
whose determinant is not a candidate key
Non-Candidate-Key FD's
catid → catnam
stylecode → stylenam
stylecode → styledate
stylecode → catid
stylecode → catnam
• follows by transitive inference,
because
stylecode → catid & catid → catnam
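Transitive inference can be mechanized by computing the set of attributes reachable from a determinant. The sketch below is a standard attribute-closure loop. The function name closure and the tuple encoding of the FDs are choices made for illustration.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for determinant, dependents in fds:
            # Fire an FD once its whole determinant is already in the result.
            if set(determinant) <= result and not set(dependents) <= result:
                result |= set(dependents)
                changed = True
    return result


# The directly stated FDs from the Item exercise (determinant, dependents).
fds = [
    (("catid",), ("catnam",)),
    (("stylecode",), ("stylenam", "styledate", "catid")),
]

# stylecode -> catnam is not listed, but follows through catid.
print("catnam" in closure({"stylecode"}, fds))  # True
print("stylenam" in closure({"catid"}, fds))    # False
```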
Simple
Conceptual
Normalization
Simple Conceptual Normalization
Employee( empno, ename, deptno, dname, addr )
deptno → dname
Given an entity class with a
(non-trivial) functional dependency
whose determinant is NOT a candidate key
– Split out a new entity class
– Make the determinant the primary key (or at
least a candidate key) of the new class
– Move all attributes that depend on it
Employee( empno, ename, addr )
Dept( deptno, dname )
Note: Most books only discuss Normalization at the
Relational Design level. However, Conceptual
Normalization, though not complete, is a way to
improve a conceptual design. We'll examine
Normalization in much more detail later in the term
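The split can be sketched on sample rows. The helper name split_out and the addr values are hypothetical. The sketch also keeps deptno in each remaining Employee row, which is an assumption: in the conceptual model the link to Dept is carried by the relationship, and rows need some value to play that role.

```python
def split_out(rows, determinant, dependents):
    """Split rows on the FD determinant -> dependents.

    Returns (remaining_rows, new_class_rows). The determinant stays in the
    remaining rows to stand in for the relationship link (an assumption).
    """
    new_class = {}
    remaining = []
    for row in rows:
        key = row[determinant]
        if key not in new_class:
            # One row per distinct determinant value in the new class.
            new_class[key] = {determinant: key, **{a: row[a] for a in dependents}}
        remaining.append({a: v for a, v in row.items() if a not in dependents})
    return remaining, list(new_class.values())


# addr values are placeholders invented for this sketch.
employees = [
    {"empno": 7698, "ename": "BLAKE", "deptno": 30, "dname": "SALES", "addr": "addr-1"},
    {"empno": 7844, "ename": "TURNER", "deptno": 30, "dname": "SALES", "addr": "addr-2"},
    {"empno": 7839, "ename": "KING", "deptno": 10, "dname": "ACCOUNTING", "addr": "addr-3"},
]

emps, depts = split_out(employees, "deptno", ["dname"])
print(depts)  # each department name is now stored exactly once
```

After the split, renaming a department touches one Dept row instead of one row per employee.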
Result of
Simple Conceptual Normalization
Each
simple conceptual normalization step
– adds one entity class
– adds one 1:M relationship link
Before:
Employee( empno, ename, deptno, dname, addr )
deptno → dname
After:
Employee( empno, ename, addr ) --has-- Dept( deptno, dname )
Conceptual Normalization Exercise
Assume you have designed an Item entity class with
the following attributes
itemsku     e.g. 'FX311B-24M'
stylecode   e.g. '302'
stylenam    e.g. 'Hunting Bikini Brief'
styledate   e.g. 1992 (when introduced)
catid       e.g. 'MU'
catnam      e.g. 'Mens Underwear'
size        e.g. 9
color       e.g. 'red'
Find an FD with a non-candidate-key determinant, and
use Conceptual Normalization to split out a new
entity class.
Continue doing this with all of the resulting entity
classes until each of them is in Conceptual Normal
Form.
ER Decomposition (a)
Item( itemsku, stylecode, stylenam, styledate, catid, catnam, size, color )
Step 1: stylecode → stylenam, styledate, catid, catnam
Item( itemsku, size, color ) --has-- Style( stylecode, stylenam, styledate, catid, catnam )
Step 2: catid → catnam
Item( itemsku, size, color ) --has-- Style( stylecode, stylenam, styledate ) --has-- Category( catid, catnam )
Each simple conceptual normalization step
• adds one entity class
• adds one relationship link
ER Decomposition (b)
Item( itemsku, stylecode, stylenam, styledate, catid, catnam, size, color )
Step 1: catid → catnam
Item( itemsku, stylecode, stylenam, styledate, size, color ) --has-- Category( catid, catnam )
Step 2: stylecode → stylenam, styledate
Item( itemsku, size, color ) --has-- Style( stylecode, stylenam, styledate ) --has-- Category( catid, catnam )
Each simple conceptual normalization step
• adds one entity class
• adds one relationship link
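The two decompositions apply the same splits in a different order and arrive at the same three entity classes. The sketch below checks this at the schema level. It rests on two assumptions: the first attribute listed for a class is treated as its key, and the determinant is kept in the source class as a stand-in for the "has" link. The function name normalize is hypothetical.

```python
def normalize(classes, steps):
    """Apply split steps in order.

    classes: class name -> attribute list (first attribute is the key,
             by the convention assumed in this sketch).
    steps:   (determinant, dependents, new_class_name) tuples.
    """
    classes = {name: list(attrs) for name, attrs in classes.items()}
    for determinant, dependents, new_name in steps:
        for name, attrs in list(classes.items()):
            # Split only where the determinant is present but is not the key.
            if determinant in attrs and attrs[0] != determinant:
                moved = [a for a in attrs if a in dependents]
                classes[name] = [a for a in attrs if a not in moved]
                classes[new_name] = [determinant] + moved
                break
    return classes


item = {"Item": ["itemsku", "stylecode", "stylenam", "styledate",
                 "catid", "catnam", "size", "color"]}
by_style = ("stylecode", {"stylenam", "styledate", "catid", "catnam"}, "Style")
by_category = ("catid", {"catnam"}, "Category")

order_a = normalize(item, [by_style, by_category])  # decomposition (a)
order_b = normalize(item, [by_category, by_style])  # decomposition (b)
print(order_a == order_b)  # True
```

Each step adds exactly one class, so two steps turn the single Item class into Item, Style, and Category regardless of order.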
Database Design Process
Requirements
Conceptual Design
& Conceptual Normalization
Conceptual Model
Relational Mapping
& Relational Normalization
Relational Model
Physical Mapping
Physical Model using DDL & DCL