MIS 335 - Database Systems
http://www.mis.boun.edu.tr/durahim/ Ahmet Onur Durahim
Learning Objectives
•
•
•
Database Design and ER Diagrams
• Requirements Analysis: find out what the users want from the database
– What data is to be stored in the DB
– What applications must be built on top of it
– What operations are most frequent and subject to performance requirements
• Conceptual Database Design: create a simple description of the data that closely matches how users and developers think of the data
– A high-level (semantic) description of data to be stored in the DB along with the constraints known to hold over this data
– Carried out using the ER Model
• Logical Database Design: choose a DBMS to implement the conceptual database design
– Convert conceptual DB design (ER schema) into a DB schema in the data model of the chosen DBMS (relational DB schema)
DB Design
• Schema Refinement
– Analyze the collection of relations in the relational DB schema to identify potential problems and refine it (normalization of the relations)
• Physical DB Design
– Consider expected workloads to refine for meeting the desired performance criteria
– Building indexes on tables
– Clustering some tables
– Redesign of parts of the DB schema
• Application and Security Design
– Identify entities (users, departments) and relevant roles of each entity
– Enforce access rules: For each role, identify the parts of the DB that must be accessible and must not be accessible
Entity-Relationship Model
•
– Collect the requirements
– Build a conceptual database design
•
– Widely accepted standard for initial (conceptual) database design
Entity-Relationship Model
• Conceptual DB design:
– What information about these entities and relationships should we store in the database?
– What are the integrity constraints or business rules that hold?
• A database 'schema' in the ER Model can be represented pictorially
– ER diagrams
• Can map an ER diagram into a relational schema
Entity-Relationship Diagram
[Figure: ER diagram with Employees (ssn, name, lot), Policy (cost), and Dependents (pname, age)]
Key Concepts of ER Model
• Entity
– An object that is capable of independent existence and can be uniquely identified (can be distinguished from other objects)
Item
Employee
Student
•
ssn sid type
Key Concepts of ER Model
• Entity set
– A collection of similar entities
• share the same set of properties/attributes
• Reflects the level of detail at which we want to represent information about entities
[Figure: entity set Students containing the entities Onur Alp, Zubeyde, Arzu, Esra, Ahmet]
Key Concepts of ER Model
•
– Any example?
[Figure: entity sets Students and Employees, with the entities Onur Alp, Mert, Zubeyde, Esra, Emrecan, Arzu, Ahmet, Mehmet]
Key Concepts of ER Model
• Each entity set has attributes
• Each attribute has a domain
– Domain: set of permitted values
• name attribute: the set of 20-character strings
• age attribute: the set of integers between 0 and 150
• Each entity set has a key
– A minimal set of attributes whose values uniquely identify an entity in the set
– Denoted by underlining the attribute name in the ER diagram
[Figure: entity set Employee with attributes ssn, address, and name]
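The ideas of domain and key can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names Employee and EntitySet are hypothetical, ssn is assumed to be a string, and the name domain is read as "strings of at most 20 characters".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:
    ssn: str   # key attribute (assumed to be a string)
    name: str  # domain assumed: strings of at most 20 characters
    age: int   # domain: integers between 0 and 150

    def __post_init__(self):
        # Reject values that fall outside the attribute domains.
        if len(self.name) > 20:
            raise ValueError("name is outside its domain")
        if not 0 <= self.age <= 150:
            raise ValueError("age is outside its domain")


class EntitySet:
    """A collection of entities in which each key value appears once."""

    def __init__(self):
        self._by_key = {}

    def add(self, entity):
        # The key (ssn) must uniquely identify an entity in the set.
        if entity.ssn in self._by_key:
            raise KeyError(f"duplicate key value: {entity.ssn}")
        self._by_key[entity.ssn] = entity

    def __len__(self):
        return len(self._by_key)
```

Adding two employees with the same ssn raises KeyError, and constructing an employee with age 200 raises ValueError.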
Key Concepts of ER Model
• Relationship
– Association (relation) among two or more entities
– Ahmet is enrolled in MIS335
Works_In
Enrolled
• Relationship set
– A collection of similar relationships
– Share the same properties
Key Concepts of ER Model
•
– Descriptive attributes: used to record the information about the relationship
– Ahmet Works_In University since 2014
[Figure: Employee (ssn, address, name) participating in Works_In, which has the descriptive attribute since]
ER Model
[Figure: Student (sid, name), Enrolled (semester), Course (cid, cname)]
Rectangles: entity sets
Diamonds: relationship sets
Ellipses/ovals: attributes
ER Model
•
•
[Figure: Student (sid, name), Enrolled (semester), Course (cid, cname)]
ER Model
•
[Figure: relationship set Works_In (since) among Employee (ssn, name), Departments (did, dname, budget), and Locations (address, capacity)]
ER Model
• The set of entities that participate in a relationship set may belong to the same entity set
• Each entity plays a different role in such a relationship
[Figure: Employees (ssn, name) participating twice in Reports_To, in the roles supervisor and subordinate]
Reports_To is a unary relationship
ER Model
• The set of entities that participate in a relationship set may belong to the same entity set
• Each entity plays a different role in such a relationship
[Figure: Students (sid, name) participating twice in Helps, in the roles tutor and tutee]
Cardinality Mappings
• One-to-One (1-1)
– One occurrence of an entity relates to only one occurrence in another entity
– rarely exists in practice
• consider combining them into one entity
– Example: an employee is allocated a company car, which can only be driven by that employee
• One-to-Many (1-M) / Many-to-One (M-1)
– One occurrence in an entity relates to many occurrences in another entity
– Example: an employee works in one department but a department has many employees.
Cardinality Mappings
• Many-to-Many (M-N)
– Many occurrences in an entity relate to many occurrences in another entity
– Rarely survive into the final relational design, because the normalisation process resolves any such relationship
• They usually indicate that an intermediate entity has been missed.
– Example: an employee may work on several projects at the same time and a project has a team of many employees.
– In the normalisation process this many-to-many is resolved by the entity Project Team.
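As a rough sketch of how the Project Team entity resolves the many-to-many relationship, the following uses Python's built-in sqlite3 module. The table and column names (Employee, Project, Project_Team, ssn, pid, title) and the sample rows are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (ssn TEXT PRIMARY KEY, name TEXT);
CREATE TABLE Project  (pid TEXT PRIMARY KEY, title TEXT);

-- Project_Team holds one row per (employee, project) pairing, so the
-- many-to-many relationship becomes two one-to-many relationships.
CREATE TABLE Project_Team (
    ssn TEXT NOT NULL REFERENCES Employee(ssn),
    pid TEXT NOT NULL REFERENCES Project(pid),
    PRIMARY KEY (ssn, pid)
);

INSERT INTO Employee VALUES ('e1', 'Ahmet'), ('e2', 'Esra');
INSERT INTO Project  VALUES ('p1', 'Payroll'), ('p2', 'Website');
INSERT INTO Project_Team VALUES ('e1', 'p1'), ('e1', 'p2'), ('e2', 'p1');
""")

# One employee works on several projects ...
projects_of_e1 = [row[0] for row in conn.execute(
    "SELECT pid FROM Project_Team WHERE ssn = 'e1' ORDER BY pid")]

# ... and one project has a team of several employees.
team_of_p1 = [row[0] for row in conn.execute(
    "SELECT ssn FROM Project_Team WHERE pid = 'p1' ORDER BY ssn")]
```

The composite primary key (ssn, pid) keeps the same employee from being recorded twice on the same project.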
Cardinality Mappings
[Figure: example mappings for 1-to-1, 1-to-Many, Many-to-1, and Many-to-Many]
ER Model – Key Constraints
[Figure: Employees (ssn, name), Works_In (since), Departments (did, dname)]
An employee can work in multiple departments, and a department can have multiple employees.
What is the type of this relationship?
Many-to-Many
ER Model – Key Constraints
[Figure: Employees (ssn, name), Manages (since), Departments (did, dname)]
An employee can manage multiple departments, but a department can be managed by only one employee (the manager).
What is the type of this relationship?
1-to-Many
This is called a key constraint (the restriction that each department has at most one manager) and is denoted by an arrow from Departments to Manages.
[Figure: an instance of the Manages relationship in which the department with did = ‘51’ violates the key constraint]
[Figure: an instance of the Manages relationship that satisfies the key constraint]
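The key constraint on Manages can be checked mechanically on a relationship instance. The sketch below assumes an instance is given as a set of (ssn, did) pairs; the function name and the sample values are hypothetical.

```python
from collections import Counter


def violating_departments(manages):
    """Return the dids that appear with more than one manager.

    `manages` is an instance of the Manages relationship set, given as
    a set of (ssn, did) pairs.
    """
    counts = Counter(did for _ssn, did in manages)
    return sorted(did for did, n in counts.items() if n > 1)


# Each department has at most one manager: the constraint holds.
ok_instance = {("123", "51"), ("123", "56"), ("231", "60")}

# Department 51 has two managers: the constraint is violated.
bad_instance = {("123", "51"), ("231", "51"), ("231", "60")}
```

Note that employee 123 manages two departments in ok_instance, which is allowed: the constraint restricts departments, not employees.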
Participation Constraints
• If every department is required to have a manager, this requirement is a participation constraint
• The participation of the entity set Departments in the relationship set Manages is total
• The participation of the entity set Employees in the relationship set Manages is partial
– Since not every employee gets to manage a department
• A total participation constraint of an entity set in a relationship set is indicated by connecting them with a thick line
[Figure: Employees (ssn, name), Manages (since), Departments (did, dname)]
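Whether participation is total or partial in a given instance can be tested by comparing the entity set with the entities that actually occur in the relationship. This is a minimal sketch; the function name, the pair layout (ssn, did), and the sample keys are assumptions.

```python
def participation(entity_keys, relationship_pairs, position):
    """Classify participation of an entity set as 'total' or 'partial'.

    `position` says which component of each relationship pair holds the
    key of the entity set being checked (0 for ssn, 1 for did here).
    """
    participating = {pair[position] for pair in relationship_pairs}
    return "total" if set(entity_keys) <= participating else "partial"


employees = {"e1", "e2", "e3"}
departments = {"d1", "d2"}
manages = {("e1", "d1"), ("e2", "d2")}  # (ssn, did) pairs

departments_in_manages = participation(departments, manages, 1)
employees_in_manages = participation(employees, manages, 0)
```

Here every department has a manager, so Departments participates totally, while employee e3 manages nothing, so Employees participates only partially.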
Participation Constraints
• If each employee works in at least one department, and each department has at least one employee
– Is the participation of Employees and Departments in Works_In total or partial?
[Figure: Employees (ssn, name) and Departments (did, dname) connected by Works_In (since) and Manages (since)]
• Classify entities into subclasses
• Every entity in a subclass also belongs to superclass (Employees)
• The attributes of the entity set Employees are inherited by the entity set Hourly_Emps
• Hourly_Emps ISA Employees
• Reasons for using ISA:
– To add descriptive attributes specific to a subclass
– To identify entities that participate in a relationship
[Figure: ISA hierarchy with superclass Employees (ssn, name) and subclasses Hourly_Emps (hourly_wages, hours_worked) and Contract_Emps (contractid)]
• Specialization: process of identifying subsets of an entity set (Employees) that share some distinguishing characteristic
– Employees is specialized into subclasses
• Generalization: process of identifying some common characteristics of a collection of entity sets and creating a new entity set that contains entities possessing these common characteristics
– Hourly_Emps and Contract_Emps are generalized by Employees
[Figure: ISA hierarchy with superclass Employees (ssn, name) and subclasses Hourly_Emps (hourly_wages, hours_worked) and Contract_Emps (contractid)]
• Overlap Constraints: determine whether two subclasses are allowed to contain the same entity
– Can Ahmet belong to both Contract_Emps entity and Hourly_Emps?
• Covering Constraints: determine whether the entities in the subclasses collectively include all entities in the superclass
– Does every Employees entity have to belong to one of Hourly_Emps and Contract_Emps?
[Figure: ISA hierarchy with superclass Employees (ssn, name) and subclasses Hourly_Emps (hourly_wages, hours_worked) and Contract_Emps (contractid)]
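Both questions about the ISA hierarchy reduce to simple set operations on the entities involved. The sketch below is illustrative only; the function names and the sample membership are assumptions.

```python
def overlaps(sub_a, sub_b):
    """True if the two subclasses share at least one entity."""
    return bool(set(sub_a) & set(sub_b))


def covers(superclass, *subclasses):
    """True if every superclass entity belongs to some subclass."""
    return set(superclass) <= set().union(*subclasses)


employees = {"Ahmet", "Mert", "Esra"}
hourly_emps = {"Ahmet", "Mert"}
contract_emps = {"Ahmet"}

# Ahmet is in both subclasses, so this instance needs overlap to be allowed.
has_overlap = overlaps(hourly_emps, contract_emps)

# Esra is in neither subclass, so the subclasses do not cover Employees.
is_covering = covers(employees, hourly_emps, contract_emps)
```

A design that forbids overlap or requires covering would reject this particular instance.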
Weak Entities
• Weak Entity: Entity set that does not include a key
• A weak entity can be identified uniquely only by considering the primary key of another entity (called identifying owner )
– Set of attributes of a weak entity set that uniquely identify a weak entity for a given owner entity => partial key
• A weak entity set is denoted by a rectangle with thick lines
[Figure: Employees (ssn, name) linked by the identifying relationship Policy (cost) to the weak entity set Dependents (pname, age)]
• The relationship between a weak entity and the owner entity is denoted by a diamond with thick lines
• What can you say about the constraints on the identifying relationship? (i.e., participation and key constraints)
– Owner entity set and weak entity set must participate in a one-to-many relationship set (one owner, many weak entities)
– Weak entity set must have total participation in this identifying relationship set
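One common way to carry a weak entity set into a relational schema is to fold it together with its identifying relationship into a single table whose key combines the partial key with the owner's key. The sketch below uses Python's sqlite3 module; the table name Dep_Policy, the column types, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE Employees (
    ssn  TEXT PRIMARY KEY,
    name TEXT
);

-- pname is only a partial key; a dependent is identified by (pname, ssn).
CREATE TABLE Dep_Policy (
    pname TEXT,
    age   INTEGER,
    cost  REAL,
    ssn   TEXT NOT NULL,            -- total participation: an owner is required
    PRIMARY KEY (pname, ssn),
    FOREIGN KEY (ssn) REFERENCES Employees(ssn) ON DELETE CASCADE
);
""")

conn.execute("INSERT INTO Employees VALUES ('123', 'Ahmet')")
conn.execute("INSERT INTO Employees VALUES ('231', 'Esra')")
# Two dependents share a pname but belong to different owners.
conn.execute("INSERT INTO Dep_Policy VALUES ('Arzu', 5, 100.0, '123')")
conn.execute("INSERT INTO Dep_Policy VALUES ('Arzu', 7, 80.0, '231')")

# Deleting an owner removes its dependents as well.
conn.execute("DELETE FROM Employees WHERE ssn = '123'")
remaining = conn.execute("SELECT pname, ssn FROM Dep_Policy").fetchall()
```

The ON DELETE CASCADE clause reflects the idea that a weak entity cannot exist without its identifying owner.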
Aggregation
• Used to indicate that a relationship set (denoted by a dashed box) participates in another relationship set
– Allows us to treat a relationship set as an entity set for purposes of participation in other relationships
• Aggregation vs. ternary relationship:
– Monitors is a distinct relationship, with a descriptive attribute
– Also, can say that each sponsorship is monitored by at most one employee
[Figure: Employees (ssn, name) connected by Monitors (until) to the aggregation of Sponsors (since), which relates Projects (pid, started_on, pbudget) and Departments (did, dname, budget)]
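A possible relational reading of the aggregation is sketched below with Python's sqlite3 module: Monitors refers to a Sponsors row as a whole, and making (did, pid) the key of Monitors expresses that each sponsorship is monitored by at most one employee. Column types, sample values, and the helper try_insert are assumptions, and the Employees, Departments, and Projects tables are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable foreign key enforcement
conn.executescript("""
CREATE TABLE Sponsors (
    did   TEXT,
    pid   TEXT,
    since TEXT,
    PRIMARY KEY (did, pid)
);

-- Monitors relates an employee to a sponsorship (a Sponsors row).
CREATE TABLE Monitors (
    did   TEXT,
    pid   TEXT,
    ssn   TEXT NOT NULL,
    until TEXT,
    PRIMARY KEY (did, pid),   -- at most one monitor per sponsorship
    FOREIGN KEY (did, pid) REFERENCES Sponsors(did, pid)
);

INSERT INTO Sponsors VALUES ('d1', 'p1', '2014');
INSERT INTO Monitors VALUES ('d1', 'p1', 'e1', '2016');
""")


def try_insert(sql):
    """Run an INSERT and report whether the database accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A second monitor for the same sponsorship is rejected by the key.
second_monitor = try_insert(
    "INSERT INTO Monitors VALUES ('d1', 'p1', 'e2', '2017')")

# Monitoring a sponsorship that does not exist is rejected by the foreign key.
unknown_sponsorship = try_insert(
    "INSERT INTO Monitors VALUES ('d9', 'p9', 'e1', '2017')")
```

Both inserts fail, which is the behaviour a plain ternary relationship among Employees, Departments, and Projects could not express as directly.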
• Design choices:
– Should a concept be modeled as an entity or an attribute?
– Should a concept be modeled as an entity or a relationship?
– Identifying relationships: binary or ternary? Aggregation?
• Constraints in the ER Model:
– A lot of data semantics can (and should) be captured
– But some constraints cannot be captured in ER diagrams
Entity vs. Attribute
• Should address be an attribute of Employees or an entity (connected to Employees by a relationship)?
• Depends upon the use we want to make of address information, and the semantics of the data:
– If only one address is to be recorded per employee
• Use attribute ‘address’
– If we have several addresses per employee
• address must be an entity (since attributes cannot be set-valued)
– If we want to capture the structure (break down address into country, city, street, etc.) of an address
• e.g., we want to retrieve employees in a given city
• address must be modeled as an entity (since attribute values are atomic)
Entity vs. Attribute
• Works_In does not allow an employee to work in a department for two or more periods
– This possibility is ruled out by the ER diagram’s semantics, because a relationship is uniquely identified by the participating entities (without reference to its descriptive attributes)
• Similar to the problem of wanting to record several addresses for an employee
– We want to record several values of the descriptive attributes for each instance of this relationship
– Accomplished by introducing a new entity set, Duration
[Figure: Works_In with descriptive attributes from and to, between Employees (ssn, name) and Departments (did, dname, budget)]
[Figure: revised design in which Works_In relates Employees (ssn, name), Departments (did, dname, budget), and the new entity set Duration (from, to)]
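The effect of introducing Duration can be seen with plain Python containers. In this sketch a relationship is keyed by its participating entities only, which is the assumption the slide describes; the ssn, did, and period values are made up.

```python
# Works_In identified by (ssn, did) alone: from/to are descriptive
# attributes, so a second period for the same pair replaces the first.
works_in = {}
works_in[("123", "51")] = ("2014-01", "2015-06")
works_in[("123", "51")] = ("2016-02", "2017-09")  # overwrites the first period

# With Duration as a participating entity set, the period is part of what
# identifies the relationship, so both periods can be recorded.
works_in_with_duration = {
    ("123", "51", ("2014-01", "2015-06")),
    ("123", "51", ("2016-02", "2017-09")),
}
```

The first structure ends up holding one period for employee 123 in department 51, the second holds both.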
Entity vs. Relationship
• ER diagram is OK if a manager gets a separate discretionary budget for each department
• What if a manager gets a discretionary budget that covers all managed departments?
– Redundancy: dbudget is stored for each department managed by the manager
– Misleading: suggests that dbudget is associated with the department-manager combination
[Figure: first design, with Employees (ssn, name), Manages (since, dbudget), and Departments (did, dname, budget)]
[Figure: revised design, in which Managers ISA Employees and carries dbudget, and Manages (since) relates Managers to Departments (did, dname, budget)]
Redundancies are eliminated by Normalization technique
Binary vs. Ternary Relationship
• Models the situation where:
– An employee can own several policies
– Each policy can be owned by several employees
– Each dependent can be covered by several policies
[Figure: ternary relationship set Covers among Employees (ssn, name), Policies (policyid, cost), and Dependents (pname, age)]
Binary vs. Ternary Relationship
• If we have additional requirements:
– A policy cannot be owned jointly by two or more employees
– Every policy must be owned by some employee
– Dependents is a weak entity set, uniquely identified by taking pname in conjunction with the policyid of a policy entity
• The ER diagram with the ternary relationship set Covers is then inaccurate
[Figure: bad design, with the ternary relationship set Covers among Employees (ssn, name), Policies (policyid, cost), and Dependents (pname, age)]
• What are the additional constraints in the 2nd diagram?
[Figure: better design, with the binary relationship sets Purchaser (Employees, Policies) and Beneficiary (Policies, Dependents)]
•
•
•
– S “can-supply” P, D “needs” P, and D “deals-with” S does not imply that D has agreed to buy P from S
– How do we record qty?
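A small instance makes the point concrete. In the sketch below the three binary relationships are sets of pairs, and a hypothetical mapping named contracts stands for the ternary relationship that actually carries qty; all identifiers and values are assumptions.

```python
# Binary relationships as sets of pairs.
can_supply = {("S1", "P1"), ("S2", "P1")}   # (supplier, part)
needs = {("D1", "P1")}                      # (department, part)
deals_with = {("D1", "S1"), ("D1", "S2")}   # (department, supplier)

# Every (supplier, part, department) triple that the binary
# relationships are consistent with.
candidates = {
    (s, p, d)
    for (s, p) in can_supply
    for (d, p2) in needs if p2 == p
    if (d, s) in deals_with
}

# What was actually agreed: D1 buys P1 from S1 only. The qty value is
# attached to the triple, which no single binary relationship can hold.
contracts = {("S1", "P1", "D1"): 40}
```

The binary relationships admit two candidate triples, but only one of them is a real contract, so they cannot replace the ternary relationship or record qty.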
Summary of Conceptual Design
• Conceptual design follows requirements analysis
– Yields a high-level description of data to be stored
• ER model popular for conceptual design
– Constructs are expressive, close to the way people think about their applications
• Basic constructs
– entities, relationships, and attributes (of entities and relationships)
• Some additional constructs
– weak entities, ISA hierarchies, and aggregation
• Note: There are many variations on ER model
Summary of Conceptual Design
• Several kinds of integrity constraints can be expressed in the ER model:
– key constraints
– participation constraints
– overlap/covering constraints for ISA hierarchies
• Some foreign key constraints are also implicit in the definition of a relationship set
– Some constraints (notably, functional dependencies) cannot be expressed in the ER model
• Constraints play an important role in determining the best database design for an enterprise
Summary of Conceptual Design
• ER design is subjective
• There are often many ways (alternatives) to model a given scenario
• Common choices include:
– Entity vs. attribute, Entity vs. relationship
– Binary or n-ary relationship
– Whether or not to use ISA hierarchies / aggregation
• To ensure good database design:
– Resulting relational schema should be analyzed and refined further
– FD information and normalization techniques are especially useful
ER Modeling Question - 0
•
– entity, relationship, entity set, relationship set,
– attribute, domain,
– one-to-many relationship, many-to-many relationship,
– participation constraint, overlap constraint, covering constraint,
– weak entity set, aggregation, role indicator.
ER Modeling Example - 1
•
– Professors teach courses; each of the following situations concerns the Teaches relationship set.
– For each situation, draw an ER diagram that describes it (assuming no further constraints hold)
ER Modeling Example - 1
– Professors can teach the same course in several semesters, and each offering must be recorded
– Professors can teach the same course in several semesters, and only the most recent such offering needs to be recorded. (Assume this condition applies in all subsequent questions.)
– Every professor must teach some course
ER Modeling Example - 1
– Every professor teaches exactly one course (no more, no less)
– Every professor teaches exactly one course (no more, no less), and every course must be taught by some professor
– Now suppose that certain courses can be taught by a team of professors jointly, but it is possible that no one professor in a team can teach the course. Model this situation, introducing additional entity sets and relationship sets if necessary
Different ER Modeling Notations
Chen vs. Crow’s Foot Notation
Crow’s Foot Notation