Uploaded by Jasjiv Singh

ER Model

Entity Relationship Model
(ER Model)
Dimitra Vista
Levels of Design
Database applications are large and sophisticated
• Requirements Analysis – what are we building?
• Conceptual database design – high level description of the data
• Logical database design – formal schema (must choose a DBMS)
• Schema refinement – reduce data redundancy
• Physical database design – choose indexes, storage, etc.
• Application and security design – complete software design
Domain experts must drive requirements
Conceptual Design
• Conceptual database design – high level description of the data
• Collect user requirements
– Information that needs to be represented
– Operations that need to be supported
– Several techniques for representing this information (e.g., UML)
• Develop a conceptual schema for the database
– A high-level representation
– Follows a specific data model (ER model)
– Often represented graphically (ER diagram)
Conceptual Design – Why Do We Care?
• Typical large company
– 1000s of databases (90% of them relational)
– 100s of tables in each database
– 50-200 columns in each table
• Huge amount of effort to resolve inconsistencies / violations
• It is hard to design applications
Entity-Relationship Model (ER Model)
• Used for the conceptual design of a database
• Introduced by Chen in 1976
• A very common model for database design
• Allows the specification of the schema in graphical terms
– A picture is worth a thousand words!
– Easier for other participants to understand
• Basic concepts are simple, but the model allows for sophisticated concepts
• Can be mapped automatically to the relational model
Key insight: Some designs are clearly wrong, but there are often multiple correct ones!
ER Model Basics
• Entity - a real-world object distinguishable from other objects, e.g., Mary Jones, CS123
• Entity set - a collection of similar entities, where similar means (for now) that they share the same attributes
• Attribute - a characteristic of an entity, e.g., name, age, social security number (SSN)
Alternative Representations
Keys
• Each entity set must have a key attribute (a.k.a. an identifier)
– a key uniquely identifies an entity within the entity set
– a key is either simple (1 attribute) or composite (multiple attributes)
• Keys are important! They allow us to:
– unambiguously refer to an entity within a set
– link together entities across entity sets
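A minimal sketch of the uniqueness requirement, in Python. The Student class, its sample data, and the is_key helper are hypothetical names introduced only for illustration. Note that a key is a property of the schema: checking one instance can show that a set of attributes is not a key, but cannot prove that it is.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    ssn: str   # assumed to be the intended key attribute
    name: str
    age: int


def is_key(entities, key_attrs):
    """Return True if the values of key_attrs are unique in this instance."""
    seen = set()
    for entity in entities:
        value = tuple(getattr(entity, attr) for attr in key_attrs)
        if value in seen:
            return False  # two entities share the same candidate-key value
        seen.add(value)
    return True


students = [
    Student("111-11-1111", "Mary Jones", 20),
    Student("222-22-2222", "Mary Jones", 21),
]

ssn_is_key = is_key(students, ["ssn"])               # simple key
name_is_key = is_key(students, ["name"])             # fails: duplicate names
name_age_is_key = is_key(students, ["name", "age"])  # composite candidate
```

Here name alone is refuted as a key by the two students named Mary Jones, while (name, age) merely happens to be unique in this small instance.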
Attributes
• Each attribute has a domain - a set of possible values (e.g., GPA, gender, name)
• Required vs. optional (e.g., last name vs. middle name)
• Simple vs. composite (e.g., age vs. address)
• Single-valued vs. multi-valued (e.g., car make vs. color)
• Stored vs. derived (e.g., DOB vs. age) - advantages / disadvantages?
Relationships and Relationship Sets
• A relationship is an association between two or more entities, e.g., Jane works in IT
• A relationship set is a collection of similar relationships, e.g., Employees work in Departments
Descriptive Attributes
A relationship set can have descriptive attributes
N-ary Relationship Sets
An n-ary relationship set R associates n entity sets E1, E2, ..., En. Each relationship r relates entities e1 ∈ E1, e2 ∈ E2, ..., en ∈ En.
This is a ternary relationship
What is the difference?
• Should a relationship be a ternary relationship or two binary relationships?
[Figure: first design, a ternary relationship set teaches among Lecturer, Course, and Room; second design, two binary relationship sets, teaches and lectures, linking Lecturer to Course and to Room]
Problem if the lecturer teaches more than one course: the second design cannot represent which room is assigned to which course.
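The loss of information can be sketched in Python with sets of tuples. The lecturer, course, and room names are hypothetical, and the sketch assumes the second design links Lecturer to Course (teaches) and Lecturer to Room (lectures).

```python
# Ternary design: each tuple says which lecturer teaches which course in which room.
ternary = {
    ("Smith", "CS101", "Room A"),
    ("Smith", "CS202", "Room B"),
}

# Binary design: project the ternary relationship set onto two binary ones.
teaches = {(lec, crs) for (lec, crs, room) in ternary}
lectures = {(lec, room) for (lec, crs, room) in ternary}

# Try to recover the ternary facts by matching on the lecturer.
reconstructed = {
    (lec, crs, room)
    for (lec, crs) in teaches
    for (lec2, room) in lectures
    if lec == lec2
}

# Combinations that the binary design cannot rule out.
spurious = reconstructed - ternary
```

Because Smith teaches two courses, the binary design yields four possible (lecturer, course, room) combinations, two of which were never recorded.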
Recursive Relationship Sets
The same entity set can take on different roles in the same relationship set. This is
called a recursive relationship set.
An Instance of a Relationship Set
• A relationship belongs to (is an element of) a relationship set, and so must be uniquely identifiable by participating entities.
• Meaning: each (employee, department) pair appears in the relationship set only once. Because it’s a set!
Examples
Draw ER diagrams representing entity sets and relationship sets described
below.
Cars(vin, plate, state)
Parking(lot, address)
Assigned_To()
[Figure: entity set Car (vin, plate, state) connected to entity set Parking (lot, address) by relationship set assigned_to]
Entity vs Attribute
• It is not always easy to tell whether an attribute warrants creating an entity set of its own
• Difficult cases
– composite attributes, e.g., address
– multi-valued attributes, e.g., colors of a car (exterior / interior)
• Rule of thumb:
– composite attributes should be modeled as entity sets if we care about their structure
• Rule of thumb:
– multi-valued attributes should be modeled as entity sets
Composite Attributes
• Composite attributes allow us to divide attributes into subparts (other attributes).
[Figure: entity set EMPLOYEES with attribute name and composite attribute address (street, city, state)]
What is the difference?
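[Figure: first design, EMPLOYEES with attribute name and composite attribute address (street, city, state); second design, EMPLOYEES (name) connected by relationship set lives to entity set Address (street, city, state)]
1) The second permits multiple addresses for an employee
2) Composite attributes should be modeled as entity sets if we care about their structure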
Multi-Valued Attributes
• Multi-valued attributes allow us to have attributes that can have multiple values
[Figure: entity set EMPLOYEES with attributes ssn and name and multi-valued attribute phone]
Multi-Valued Attributes vs Entity Set?
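• Multi-valued attributes allow us to have attributes that can have multiple values
[Figure: first design, EMPLOYEES (ssn, name) with multi-valued attribute phone; second design, EMPLOYEES (ssn, name) connected by relationship set has to entity set Phone (area, number)]
Rule of thumb: multi-valued attributes should be modeled as entity sets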
Derived Attributes
• Derived attributes allow us to compute rather than store the value of an attribute
[Figure: entity set EMPLOYEES with attributes ssn and dob and derived attribute age]
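As a sketch of the stored vs. derived distinction, the Python class below stores dob and computes age on demand. The class layout and the sample dates are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Employee:
    ssn: str
    dob: date  # stored attribute

    def age(self, today: date) -> int:
        """Derived attribute: computed from dob instead of being stored."""
        years = today.year - self.dob.year
        # Subtract one year if this year's birthday has not happened yet.
        if (today.month, today.day) < (self.dob.month, self.dob.day):
            years -= 1
        return years


employee = Employee("123-45-6789", date(1990, 6, 15))
age_day_before_birthday = employee.age(date(2024, 6, 14))
age_on_birthday = employee.age(date(2024, 6, 15))
```

Computing age avoids a stored value that goes stale every year, at the cost of recomputing it on each read.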
What’s wrong with this design?
• For each pair of employee and department you can have only
one value for from/to
• What’s a better design?
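The limitation can be sketched in Python by storing the relationship set as a mapping keyed by its participating entities. The name works_in, the identifiers, and the years are hypothetical. The second mapping shows only one possible fix, and it is an assumption rather than the design from the slides.

```python
# Relationship set keyed by (employee ssn, department id): the participating
# entities identify the relationship, so each pair can appear only once.
works_in = {}
works_in[("123-45-6789", "IT")] = {"from": 2015, "to": 2017}
works_in[("123-45-6789", "IT")] = {"from": 2020, "to": 2023}  # replaces the first stint

stints_recorded = len(works_in)

# One possible alternative (assumption): make the start year part of what
# identifies a stint, so that repeated stints can coexist.
works_in_v2 = {
    ("123-45-6789", "IT", 2015): {"to": 2017},
    ("123-45-6789", "IT", 2020): {"to": 2023},
}
stints_recorded_v2 = len(works_in_v2)
```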
A Better Design
Examples
Draw ER diagrams representing entity sets and relationship sets described below.
• Courses are offered by departments. A course may be offered in multiple terms, and each offering must be recorded.
[Figure: ER diagram with Courses (cid, name), Departments (did, name), Offered_by, and Offering (term)]
Examples
Draw ER diagrams representing entity sets and relationship sets described below.
• Courses are offered by departments. A course may be offered in multiple terms, and only the most recent offering must be recorded.
[Figure: Courses (cid, name) connected to Departments (did, name) by relationship set Offered_by (term)]
ER Model – Beyond the Basics
• cardinality
• key constraints
• participation constraints
• weak entities
• ISA (is a) relationship
• entity clustering
Cardinality
Constraining Relationships
• Key constraints: entity relates at most once
• Participation constraints: entity relates at least once
[Figure: relationship set teaches with entity set Teacher, drawn once for each constraint]
Key Constraints
• Business rule: A professor may teach 0, 1 or multiple classes. Each class is taught by at most one professor.
Key Constraints
• Business rule: A professor teaches at most one class. Each class is taught by at most one professor.
Participation Constraints
Business rule: A professor may teach 0, 1 or multiple classes. Each class is
taught by some professor.
Participation Constraints
Business rule: Each professor teaches some class. Each class is taught by
some professor.
Participation Constraints
Business rule: Each professor teaches some class. Each class is taught by
exactly one professor.
Explain the Business Rules
[Figure: relationship set teaches between Lecturer and Course]
Business rules:
Each lecturer teaches at most one course
Each course must be taught by some lecturer
Explain the Business Rules
[Figure: relationship set favorite between Student and Course]
Business rules:
Each student has exactly one favorite course
A course does not have to be the favorite of some student
Explain the Business Rules
Business rules:
Each department is managed by exactly one employee.
Each employee works in some department. Each department has some
employee working in it.
Example
Employees have managers. An employee has at most one manager. A
manager may manage any number of employees.
Example
Professors advise students. Each professor advises at most one student.
Each student has exactly one advisor.
Example
• An author is described by a name and a date of birth (dob). No two authors have the same combination of name and dob.
• A book is described by an ISBN, a title and a year when it was written (year). No two books have the same ISBN number.
• A book is written by exactly one author.
• An author is only included in the database if she has authored at least one book.
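One way these rules might carry over to relational tables is sketched below with Python's built-in sqlite3 module. The table and column names and the sample rows are assumptions, and this is not the ER diagram the exercise asks for. The composite key (name, dob) identifies an author, and the NOT NULL foreign key gives each book exactly one author. The rule that every author has at least one book is not enforced by this schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Author (
    name TEXT NOT NULL,
    dob  TEXT NOT NULL,
    PRIMARY KEY (name, dob)             -- composite key
);
CREATE TABLE Book (
    isbn        TEXT PRIMARY KEY,       -- simple key
    title       TEXT NOT NULL,
    year        INTEGER,
    author_name TEXT NOT NULL,          -- NOT NULL + foreign key: exactly one author
    author_dob  TEXT NOT NULL,
    FOREIGN KEY (author_name, author_dob) REFERENCES Author (name, dob)
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO Author VALUES ('A. Writer', '1970-01-01')")
conn.execute(
    "INSERT INTO Book VALUES ('isbn-1', 'First Book', 2001, 'A. Writer', '1970-01-01')"
)

# A book that points at an unknown author violates the foreign key.
try:
    conn.execute(
        "INSERT INTO Book VALUES ('isbn-2', 'Orphan', 2002, 'Nobody', '1900-01-01')"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

book_count = conn.execute("SELECT COUNT(*) FROM Book").fetchone()[0]
```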
Example
• A scientist is described by a name and a date of birth (dob). No two scientists have the same combination of name and dob.
• A discovery has a name. No two discoveries have the same name.
• An award has a name, which uniquely identifies it.
• Scientists make discoveries. A discovery is made by one or several scientists. A scientist who made no discoveries is not tracked in our database.
• Scientists receive awards. A scientist may receive the same award more than once in different years. All years in which a scientist received a particular award must be recorded.
Example
Chefs work at restaurants. A chef is uniquely identified by a SSN, and is also
described by a name and a cuisine in which she specialized. A restaurant is uniquely
identified by a combination of name and city. Each chef works in at least one
restaurant, and each restaurant must have at least one chef working at it. Some
chefs own restaurants, and if a chef owns a restaurant - she is its sole owner.
A Note on Keys
• A key in a binary relationship is clear
[Figure: relationship set teaches between Lecturer and Course, with a key constraint on Lecturer]
A lecturer teaches at most one course
If you know the lecturer, you know the course
A Note on Keys
• A key in a ternary relationship is not so clear
[Figure: two versions of the ternary relationship set teaches among Lecturer, Course, and Room. In the first, Lecturer is the key. In the second, Lecturer is a key and Course is also a key.]
Key and Participation Constraints: Recap
• Key constraint
– denoted by a line with an arrowhead towards the relationship set
– occurs only in 1-to-many or 1-to-1 relationship sets
– example: a course is taught by one / at most one / exactly one professor
• Participation constraint
– denoted by a bold line towards the relationship set
– can occur on the 1 or on the many side of a relationship set
– example: a course must be taught by a professor
– example: a course is taught by at least one professor
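Both constraints can be checked against an instance of a relationship set. The Python sketch below treats a relationship set as a set of tuples. The helper names and the sample professors and classes are hypothetical.

```python
def satisfies_key_constraint(relationship, position):
    """At most once: no entity appears twice at the given tuple position."""
    seen = set()
    for tup in relationship:
        if tup[position] in seen:
            return False
        seen.add(tup[position])
    return True


def satisfies_participation(relationship, position, entity_set):
    """At least once: every entity in entity_set appears at the given position."""
    return set(entity_set) <= {tup[position] for tup in relationship}


professors = {"Ada", "Grace", "Edgar"}
classes = {"DB", "OS", "AI"}
teaches = {("Ada", "DB"), ("Ada", "OS"), ("Grace", "AI")}  # (professor, class)

each_class_at_most_one_prof = satisfies_key_constraint(teaches, 1)
each_prof_at_most_one_class = satisfies_key_constraint(teaches, 0)
every_class_is_taught = satisfies_participation(teaches, 1, classes)
every_prof_teaches = satisfies_participation(teaches, 0, professors)
```

In this instance each class has exactly one professor (key constraint plus participation constraint on the class side), while Ada teaches two classes and Edgar teaches none.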
Weak Entities
• The identifying owner entity and the weak entity must participate in a one-to-many relationship set. This relationship set is called the identifying relationship set of the weak entity.
• The weak entity must have total participation in the identifying relationship set.
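A common relational rendering of a weak entity can be sketched with Python's built-in sqlite3 module. The Employees and Dependents tables are a hypothetical example that does not appear in these slides. The weak entity's key combines the owner's key with its own partial key, and deleting the owner removes its weak entities.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for ON DELETE CASCADE to take effect
conn.executescript("""
CREATE TABLE Employees (
    ssn  TEXT PRIMARY KEY,
    name TEXT
);
CREATE TABLE Dependents (
    pname TEXT NOT NULL,                -- partial key: unique only per owner
    ssn   TEXT NOT NULL,                -- total participation: an owner is required
    PRIMARY KEY (ssn, pname),           -- owner key + partial key
    FOREIGN KEY (ssn) REFERENCES Employees (ssn) ON DELETE CASCADE
);
INSERT INTO Employees VALUES ('111', 'Ann'), ('222', 'Bob');
INSERT INTO Dependents VALUES ('Sam', '111'), ('Sam', '222');
""")

dependents_before = conn.execute("SELECT COUNT(*) FROM Dependents").fetchone()[0]
conn.execute("DELETE FROM Employees WHERE ssn = '111'")
dependents_after = conn.execute("SELECT COUNT(*) FROM Dependents").fetchone()[0]
```

Two dependents named Sam can coexist because they belong to different owners, and the dependent of employee 111 disappears together with its owner.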
In Summary
At most one
At least one
Exactly one
Weak entity
Extended ER-Model
• Aggregation (aka entity clustering)
– Models the relationships of a relationship
• “is-a” relationship
– Like subclasses in an object-oriented language
– Avoids data redundancy
Extended ER Model - Aggregation
• Aggregation is used when we have to model a relationship with another relationship
• Allows us to treat a relationship set as an entity set for purposes of participation in (other) relationships
A Night at the Opera
“Let’s develop the conceptual data model for an opera production.”
• What are the entities?
• What are the relationships?
• What are the business rules?
Opera: Entity Sets
• Opera (title, year) - may have more than one opera by the same title, but not in the same year (e.g., Otello - Verdi 1887 / Rossini 1816)
• Composer (name, birth, death) - may have more than one composer with the same name, but not also with the same birth and death years
• Librettist (name, birth, death)
• Company (name)
• Venue (name, city)
• Production - an entity set or a relationship set, linking Opera and Company?
• Performance - an entity set or a relationship set, linking Production and Venue?
Opera: Relationship Sets
• An Opera is written by Composer / Librettist
– each Composer composed at least one Opera
– each Librettist wrote the libretto for at least one Opera
• An Opera by the same name may be written by a different (Composer, Librettist) pair
• A piece is produced by a Company
• A production is performed at a Venue
Opera: The Model
ISA (is a) Relationships
• A “ISA” B means that every A is also a B
– A inherits all the attributes from B
– Use it to add descriptive attributes only to the subclass
– Use it to describe relationships with the superclass
[Figure: PERSON (name, ssn) with ISA subclasses INSTRUCTORS (salary) and STUDENTS (gpa)]
Is-A Relationships
• Like subclasses in an object-oriented language
• They reduce data redundancy (you don’t have to repeat data that exist in the parent)
– Subclasses inherit attributes from the parent
– Subclasses inherit relationships from the parent
No multiple inheritance, single parent hierarchy (tree) !!!
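The subclass analogy can be sketched with Python dataclasses, using a PERSON / INSTRUCTORS / STUDENTS hierarchy with attributes ssn, name, salary, and gpa. The class names and the sample values are assumptions. The analogy is loose: a Python object has a single class, whereas an entity may belong to several overlapping subclasses.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    ssn: str
    name: str


@dataclass
class Instructor(Person):
    salary: float  # attribute that only instructors have


@dataclass
class Student(Person):
    gpa: float     # attribute that only students have


student = Student(ssn="222-22-2222", name="Mary Jones", gpa=3.7)

# Inherited attributes come first, followed by the subclass's own.
student_attrs = [f.name for f in fields(Student)]
is_also_person = isinstance(student, Person)
```

The parent attributes ssn and name are declared once in Person and reused by both subclasses, which is the redundancy saving described above.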
ISA (is-a) Relationships
• Overlap constraints determine whether two subclasses are allowed to contain the same entity
• Covering constraints determine whether the entities in the subclasses collectively include all entities in the superclass
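For a given instance, both properties reduce to simple set operations. In the Python sketch below the function names and the entity identifiers are hypothetical.

```python
def subclasses_overlap(sub_a, sub_b):
    """True if some entity belongs to both subclasses."""
    return bool(set(sub_a) & set(sub_b))


def subclasses_cover(superclass, *subclasses):
    """True if every superclass entity belongs to at least one subclass."""
    return set(superclass) <= set().union(*subclasses)


persons = {"p1", "p2", "p3"}
instructors = {"p1", "p2"}
students = {"p2"}

has_overlap = subclasses_overlap(instructors, students)        # p2 is in both
is_covered = subclasses_cover(persons, instructors, students)  # p3 is in neither
```

Whether an overlap is permitted, or a cover required, is a design decision recorded in the schema. The functions only report what a particular instance contains.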
[Figure: PERSON with ISA subclasses INSTRUCTORS and STUDENTS, shown twice. Overlaps: a person can be both an instructor and a student. Covers: a person must be either an instructor or a student.]
Conceptual Design Using The ER Model
Design Choices
• Entity vs attribute?
• Entity vs relationship?
• Binary vs ternary relationship?
• Aggregation vs ternary relationship?
Constraints
• A lot of data semantics can (and should) be captured
• But some constraints cannot be captured in ER diagrams
• Check the book for more examples and more explanations
Design Choices: Entity or Attribute?
• Should address be an attribute of Employee or an entity (connected to Employee by a relationship)?
[Figure: first design, Employee (ssn) with attribute address; second design, Employee (ssn) connected by relationship set has to entity set Address (num, street, city)]
It depends.
• If we have several addresses per employee, address must be an entity (since attributes cannot be set-valued)
• If the address is important as a structure (e.g., we are doing location-based analytics), address must be modeled as an entity
Design Choices: Entity or Relationship?
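• Should visit be a relationship or an entity?
[Figure: first design, relationship set visit (date) between Patient and Doctor; second design, entity set Visit (date) alongside Patient, Doctor, and relationship set sees]
• If visit is a relationship, only one date is allowed for each (patient, doctor) pair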
Design Choices: Binary or Ternary Relationship?
• Should a relationship between 3 entities be a ternary relationship or two binary relationships?
[Figure: first design, a ternary relationship set covers among Employee, Policy, and Dependent; second design, two binary relationship sets, purchases and covers, over the same three entity sets]
• First design does not allow a policy to cover multiple dependents
Design Choices: Binary or Ternary Relationship?
• Should a relationship between 3 entities be a ternary relationship or two binary relationships?
[Figure: first design, a ternary relationship set contract (qty) among Part, Supplier, and Department; second design, binary relationship sets cansupply and needs over the same entity sets]
• In the second design, where should the qty attribute go?
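Design Choices: Ternary Relationship or Aggregation?
• Should a relationship be a ternary relationship or an aggregation?
[Figure: first design, a ternary relationship set written among Opera, Composer, and Company; second design, relationship set written between Opera and Composer, aggregated and connected to Company by relationship set produced]
• First design does not allow an opera to be written unless it is produced by a company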
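The aggregation design can be sketched in Python by letting produced refer to whole (opera, composer) pairs from written. The company name and the helper function are hypothetical.

```python
# Relationship set written: (opera, composer) pairs.
written = {("Otello", "Verdi"), ("Otello", "Rossini")}

# Aggregation: produced relates an entire written relationship to a company.
produced = {(("Otello", "Verdi"), "Company A")}


def produced_refers_to_written(produced, written):
    """Every production must point at an existing written relationship."""
    return all(pair in written for (pair, _company) in produced)


aggregation_is_consistent = produced_refers_to_written(produced, written)

# Operas that were written but have no production yet. A ternary
# (opera, composer, company) tuple could not record these without a company.
written_but_unproduced = written - {pair for (pair, _company) in produced}
```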
In Summary
• Lots of subtle points
• In database design, these choices matter a lot!
• Hard to consider all possible scenarios in advance, but in principle, if you want to work with a database, you must make those choices
Database design is an iterative process!
ER Modeling Tools
Many of them translate to SQL automatically
• ER/Studio
• Enterprise Architect
• ERWin
• PgModeler, PgDesigner
• UML tools
• and many others
https://dbmstools.com/categories/database-diagram-tools/postgresql
Precise visualization may differ, concepts are similar
Alternative Visualization
This is called crow’s foot notation
Music Brainz Database
Lessons
• Lesson 1: A picture is worth a thousand words
• Lesson 2: Conceptual modeling follows requirements analysis
• Lesson 3: ER models are not complete, but approximate designs (subjective)
Acknowledgements
Some slides in this course are inspired by or copied from
• https://www.cs.drexel.edu/~julia/cs461/ by Julia Stoyanovich, Drexel U, Spring 2018
• https://sfu-db.github.io/cmpt354/ by Jiannan Wang - Simon Fraser, Fall 2018
• https://www.db-book.com/db7/slides-dir/index.html by Avi Silberschatz, Henry F. Korth, S. Sudarshan
• https://www.youtube.com/watch?v=i7iugcmrBJ8&list=PLXPbT_PYOiRipfX8zrv_9EpnSOpK9P__j, Immanuel Trummer, Cornell University, Intro to Database Systems
• https://www.youtube.com/playlist?list=PLSE8ODhjZXjbohkNBWQs_otTrBTrjyohi, Andy Pavlo, CMU, Intro to Database Systems