Database Design Process THE AMERICAN UNIVERSITY IN CAIRO COMPUTER SCIENCE DEPARTMENT CSCI-253 FUNDAMENTALS OF DATABASE SYSTEMS DATABASE DESIGN PROCESS Dr. KHALIL (1) Database Design Process DATABASE DESIGN PROCESS Miniworld Requirements Collection and Analysis Functional Requirements Database Requirements Functional Analysis Conceptual Design High-level Transaction Specification Conceptual Schema (In a high-level data model) DBMS-independent DBMS-specific Application Program Design Logical Design (Data Model Mapping) Logical (Conceptual) Schema (In the data model of a specific DBMS) Physical Design Transaction Implementation Internal Schema (For the same DBMS) Application Programs (2) Database Design Process DATA MODELS Two data models are used commonly in the database design process: The Entity-Relational data model as a highlevel conceptual data model, and The Relational data implementation data model. model as an A database schema in the ER model Number Fname Minit Lname N 1 WORKS_FOR Name Location Name Address Sex SSN Salary NumerOfEmployee StartDate EMPLOYEE Bdate DEPARTMENT 1 1 1 MANAGES CONTROLS N Hours supervisor 1 supervisee SUPERVISION M N PROJECT WORKS_ON N 1 Name DEPENDENTS_OF Location Number N DEPENDENT Name Sex BirthDate Relationship (3) Database Design Process A database schema in the Relational model EMPLOYEE MINIT FNAME LNAME SSN BDATE ADDRESS SEX SALARY f. k. f. k. SUPERSSN DNO p. k. DEPARTMENT f. k. DNAME DNUMBER MGRSSN MGRSTARTDATE p. k. DEPT_LOCATIONS f. k. DNUMBER DLOCATION PROJECT f. k. PNAME PNUMBER PLOCATION DNUM p. k. p. k. WORKS_ON f. k. f. k. ESSN PNO p. k. HOURS f. k. DEPENDENT ESSN DEPENDENT NAME SEX BDATE RELATIONSHIP p. k. (4) Database Design Process Mapping from ER schema into Relational schema Mapping ER Schema Relational Schema (5) Database Design Process Entity-Relationship Data Model Case Study Consider the following data requirements for a COMPANY database: The company database keeps track of a company’s employees, departments, and projects. The following data requirements are collected: The company is organized into departments. Each department has a unique name, a unique number, and a particular employee who manages the department. The database keeps track of the start date when that employee began managing the department. A department may have several locations. A department controls a number of projects, each of which has unique name, a unique number, and a single location. The database stores each employee’s name, social security number, address, salary, sex, and birthdate. An employee is assigned to one department but may work on several projects, which are not necessarily controlled by the same department. The database keeps track of the number of hours per week that an employee works on each project. The database also keeps track of the direct supervisor of each employee. The database keeps track of the dependents of each employee for insurance purposes by storing each dependent’s name, sex, birthdate, and relationship to the employee. (6) Database Design Process Number Fname Minit Lname N 1 WORKS_FOR Name Location Name Address Sex SSN Salary NumerOfEmployee StartDate EMPLOYEE Bdate DEPARTMENT 1 1 1 MANAGES CONTROLS N Hours supervisor 1 supervisee SUPERVISION M N PROJECT WORKS_ON N 1 Name DEPENDENTS_OF Location Number N DEPENDENT Name Sex BirthDate Relationship ER schema diagram for the company database (7) Database Design Process ER Model Concepts Entity type Relationship type (8) Database Design Process (1) Entity Type Concept An Entity: An entity is a “thing” in the real world with an independent existence. That existence may be either: Physical existence, such as, a particular person, car, house, or employee, or Conceptual existence, such as a company, a job, or a university. Each entity has particular properties, called attributes, that describe it. For example, an employee entity may be described by the employee’s name, age, address, salary, and job. A particular entity will have a value for each of its attributes. Examples: e1 Name Sex Age = Adel Sadek = Male = 45 e2 Name Sex =Aly Sherif = Male (9) Database Design Process Age = 55 d1 Name Manager = Transport = Aly Sherif d2 Name Manager = Research = Adel Sadek p1 Name Location = ProjectX = Toshkie (10) Database Design Process An Entity Type: An entity type is the set of entities that have the same attributes. In the company database, we may distinguish the following entity types: Employee Department Project Dependent In another database, such as a database for a university, we may distinguish some entity types such as: College Department Student Course Teacher (11) Database Design Process Types of Attributes simple(atomic) versus single-valued versus stored versus composite multivalued derived Simple (atomic) Attribute which is not divisible. is an attribute Composite Attribute is an attribute which can be divided into smaller subparts, which represent more basic attributes with independent meaning of their own. (12) Database Design Process Single-valued Attribute is an attribute which will have a single value for a particular entity. Multivalued Attribute is an attribute which can have several values for a particular entity. __________________________________ Stored Attribute is an attribute whose values for different entities are stored directly in the database. Derived Attribute is an attribute whose value for a particular entity can be computed or derived from the value of one or more stored attributes for the same entity. ___________________________________ Null Attribute: In some cases a particular entity may not have any applicable value for an attribute. modified. Null can be used in two different situations: not applicable or unknown facts. Key Attribute of an Entity Type An entity type usually has an attribute whose values are distinct for each individual entity. Such an attribute - satisfying uniqueness constraint - is call a key attribute, and its values can be used to identify each entity uniquely. (13) Database Design Process (14) Database Design Process Modelling in ER model Phase 1 Initial Model Relationship Types Phase 2 Final Model Entity Types Entity Types (15) Database Design Process DEPARTMENT Name, Number, {Location}, Manager, ManagerStartDate PROJECT Name, Number, Location, ControllingDepartment EMPLOYEE Name (Fname, Minit, Lname), SSN, Sex, Address, Salary, BirthDate, Department, Supervisor, {WorksOn(Project, Hours)} DEPENDENT Employee, DependentName, Sex, Relationship Initial Model of the Company Database (16) Database Design Process ER Diagrams Symbol Meaning ENTITY TYPE WEAK ENTITY TYPE RELATIONSHIP TYPE IDENTIFYING RELATIONSHIP TYPE ATTRIBUTE KEY ATTRIBUTE MULTIVALUED ATTRIBUTE COMPOSITE ATTRIBUTE DERIVED ATTRIBUTE E1 E1 R 1 R R N (min, max) E2 TOTAL PARTICIPATION OF E2 IN R E2 CARDINALITY RATIO 1:N FOR E1:E2 IN R E STRUCTURAL CONSTRAINT (min, max) ON PARTICIPATION OF E IN R Initial ER diagram for the company database. (17) Database Design Process Fname Minit Lname Address Name Sex SSN EMPLOYEE Salary Birthdate Sex Key Location DEPARTMENT Department NumberOfEmployees Supervisor Employee Number Name Manager StartDate Name Name Number Location Department DEPENDENT Birthdate PORJECT Relationship (18) Database Design Process (2) Relationship Type Concept A relationship type R among n entity types E1, E2, ... , En defines a set of associations among entities from these entity types. Mathematically, R is a set of relationship instances ri, where each ri associates n entities (e1, e2, ..., en), and each entity ej in ri is a member of entity type Ej, 1 <= j <= n. Each of the entity types E1, E2, ..., En is said to participate in the relationship R, and similarly, each of the individual entities e1, e2, ..., en is said to participate in the relationship instance ri = (e1, e2, ..., en). EMPLOYEE e1 WORKS_FOR DEPARTMENT r1 d1 e2 e3 r2 d2 r3 e4 e5 e6 e7 r4 d3 r5 r6 r7 (19) Database Design Process Some instances of the WORKS_FOR relationship EMPLOYEE WORKS_FOR DEPARTMENT WORKS_FOR relationship type (20) Database Design Process Degree of a Relationship Type The degree of a relationship type is the number of participating entity types. Hence, the WORKS_FOR relationship type is of degree two. A relationship type of degree two is called binary, and one of degree three is called ternary. (21) Database Design Process An example of a ternary relationship type SUPPLIER s1 s2 SUPPLY PROJECT r1 j1 r2 j2 r3 r4 j3 PART r5 p1 r6 p2 p3 r7 (22) Database Design Process Structural Constraints 1. Cardinality ratio specifies the number of relationship instances that an entity is allowed to participate in. Common cardinality ratios for binary relationship types are: 1 : 1 (one to one) 1 : N (one to many) M : N (many to many) (23) Database Design Process MANAGES 1 EMPLOYEE M 1 DEPARTMENT N 1 WORKS_FOR 1 CONTROLS WORKS_ON N N PROJECT WORKS_FOR relationship type DEPARTMENT:EMPLOYEE of cardinality ratio 1:N. MANAGES relationship type DEPARTMENT:EMPLOYEE of cardinality ratio 1:1. CONTROLS relationship type DEPARTMENT:PROJECT of cardinality ratio 1:N. WORKS_ON relationship cardinality ratio M:N. type EMPLOYEE:PROJECT of (24) Database Design Process EMPLOYEE e1 MANAGES DEPARTMENT r1 d1 e2 e3 r2 d2 r3 e4 d3 e5 e6 e7 The 1:1 relationship MANAGES EMPLOYEE e1 e2 e3 WORKS_ON PROJECT r1 p1 r2 p2 r3 p3 e4 r4 p4 r5 r6 r7 The M:N relationship WORKS_ON (25) Database Design Process (26) Database Design Process 1. 2. Participation Constraint Specifies whether the existence of an entity depends on its being related to another entity via relationship type. There are two types of participation constraints: Partial Total (27) Database Design Process MANAGES 1 EMPLOYEE M 1 DEPARTMENT N 1 WORKS_FOR 1 CONTROLS WORKS_ON N N PROJECT Participation Constraint. The participation of WORKS_FOR is total. The participation MANAGES is total. of EMPLOYEE DEPARTMENT in in All other participations are partial. (28) Database Design Process Attributes of Relationship Types Relationship types can also have attributes, similar to those of entity types. MANAGES 1 1 EMPLOYEE DEPARTMENT StartDate M Hours WORKS_ON N PROJECT Attributes of 1:1 or 1:N relationship types can be migrated to one of the participating entity types. (29) Database Design Process WorkStartDate WORKS_FOR N 1 EMPLOYEE DEPARTMENT (a) DEPARTMENT (b) WorkStartDate WORKS_FOR N 1 EMPLOYEE An attribute of 1:N relationship type (a) equivalent to (b) EMPLOYEE M Hours WORKS_ON N PROJECT An attribute of M:N relationship type (30) Database Design Process Recursive Relationship Type A recursive relationship type is a relationship type associating entities of the same entity type. Role Names Each entity type that participates in a relationship type plays a particular role in the relationship. The role name signifies the role that a participating entity from the entity type plays in each relationship instance. (31) Database Design Process EMPLOYEE SUPERVISION e2 1 r2 2 1 e3 e4 r3 2 1 2 e5 e6 r1 2 e1 r4 1 1 2 r5 1 e7 2 r6 The recursive relationship type SUPERVISION. The EMPLOYEE plays two roles: Supervisor (1), and Supervisee (2). (32) Database Design Process Weak Entity Types Some entity types may not have any key attributes of their own; these are called weak entity types. Entities of a weak entity type are identified by being related to specific entities from another entity type in combination with some f their attribute values. That other entity type is called the identifying owner and the relationship type that relates a weak entity type to its owner is called identifying relationship of the weak entity type. A weak entity type has a total participation constraint with respect to its identifying relationship, because a weak entity cannot be identified without an owner entity. (33) Database Design Process EMPLOYEE 1 OWNS DependentName N Sex DEPENDENT Relationship BirthDate DEPENDENT as a weak entity type EMPLOYEE Dependent Relationship BirthDate DependentName Sex The weak entity type DEPENDENT as a composite multivalued attribute (34) Database Design Process XKey X 1 XY N YKey Y 1 YZ N ZKey Z Multi-level weak entity types (35) Database Design Process Refining the ER diagram This is the second and last stage in the ER data modelling process. In that stage, relationships are specified explicitly in the ER diagram. Structural constraints (cardinality ratio and participation) are added to each defined relationship type. Refining is done on the initial ER model, where several implicit relationships among the various entity types can be identified. In fact, whenever an attribute on entity type refers to another entity type, then this is representing an explicit relationship type among the two entity types. Applying these rules, the following relationship types are specified in the ER model of the company database: 1- 2- 3- MANAGES, a 1:1 relationship type between EMPLOYEE and DEPARTMENT. EMPLOYEE participation is partial. DEPARTMENT participation is not clear from the data requirements. We question the users, who say that a department must have a manager at all times, which implies total participation. The attribute StartDate is assigned to this relationship type. WORKS_FOR, a 1:N relationship type between DEPARTMENT and EMPLOYEE. Both participations are partial. CONTROLS, a 1:N relationship type between DEPARTMENT and PROJECT. Both participations are partial. (36) Database Design Process 4- 5- 6- SUPERVISION, a 1:N relationship type between EMPLOYEE (in the supervisor role) and EMPLOYEE (in the supervisee role). Both participations are determined to be partial. WORKS_ON, an M:N relationship type between EMPLOYEE and PROJECT. The attribute Hours is assigned to this relationship type. Both participations are determined to be partial. DEPENDENTS_OF, a 1:N relationship type between EMPLOYEE and DEPENDENT, which is also the identifying relationship for the weak entity type DEPENDENT. The participation of EMPLOYEE is partial, whereas that of DEPENDENT has to be total. Alternative ER Notation An alternative ER notation for specifying structural constraints involves associating a pair of integer numbers (min, max) with each participation of an entity type E in a relationship type R, where 0 min max and max 1 The numbers mean that, for each entity e in E , e must participate in at least min and at most max relationship instances in R at all times. In this method, min = 0 implies partial participation , whereas, min > 0 implies total participation. This method is more precise, and we can use it easily to specify structural constraints for relationship types of any degree. (37) Database Design Process (38) Database Design Process Number Fname Minit Lname Name Sex SSN WORKS_FOR (1,1) Address Salary (4,n) NumerOfEmployee StartDate Location Name DEPARTMENT (1,1) EMPLOYEE Bdate (0,n) (0,1) MANAGES CONTROLS (0,n) (0,1) (0,n) supervisor Hours (0,1) supervisee SUPERVISION (0,n) WORKS_ON (0,n) PROJECT Name DEPENDENTS_OF Number (1,1) DEPENDENT Name Sex BirthDate Relationship ER diagram for the COMPANY schema, with all role names included and with structural constraints on relationships specified using the alternate notation (min, max). 39 Location