CS 405G Introduction to Database Systems Lecture 3: ER Model Instructor: Chen Qian Review A database is A miniworld is some aspect of the real word, described by facts (data) A data model is a large collection of integrated data a group of concepts for describing data. Which one(s) is a data model ER model Relation model Objected oriented programming model 7/1/2016 Chen Qian Univ. of Kentucky 2 Today’s Outline Database design ER Model ER Diagrams – Notation Database schema Case studies 7/1/2016 Chen Qian Univ. of Kentucky 3 Database Design Requirement Analysis: Understand the mini-world being modeled Conceptual Database Design: Specify it using a database design model A few popular ones are: Entity/Relationship (E/R) model (in this course) UML (Unified Modeling Language) Intuitive and convenient Logical Design: Translate specification to the data model of DBMS Relational, XML, object-oriented, etc. 7/1/2016 Chen Qian Univ. of Kentucky 4 Database Design 7/1/2016 Chen Qian Univ. of Kentucky 5 An Database Design Example The company is organized into DEPARTMENTs. Each department has a name, number and an employee who manages the department. We keep track of the start date of the department manager. Each department controls a number of PROJECTs. Each project has a name, number and is located at a single location. We store each EMPLOYEE’s social security number, address, salary, sex, and birthdate. Each employee works for one department but may work on several projects. We keep track of the number of hours per week that an employee currently works on each project. We also keep track of the direct supervisor of each employee. Each employee may have a number of DEPENDENTs. For each dependent, we keep track of their name, sex, birthdate, and relationship to employee. 7/1/2016 Chen Qian Univ. of Kentucky 6 7/1/2016 Luke Huan Univ. of Kansas 7 Entity-relationship (E/R) model Historically and still very popular Can think of as a “watered-down” object-oriented design model Primarily a design model—not directly implemented by DBMS Designs represented by E/R diagrams there are other styles 7/1/2016 Chen Qian Univ. of Kentucky 8 Entities and Attributes Entity: A specific object or “thing” in the mini-world that is represented in the database. For example, the EMPLOYEE John Smith, the Research DEPARTMENT, the iPhone8 PROJECT. Attributes: properties used to describe an entity. For example, an EMPLOYEE entity may have a Name, SSN, Address, Sex, BirthDate A specific entity will have a value for each of its attributes. For example, a specific employee entity may have Name='John Smith', SSN='123456789', Address ='731 Fondren, Houston, TX', Sex='M', BirthDate='09-JAN-55' 7/1/2016 Chen Qian Univ. of Kentucky 9 Types of Attributes Simple vs. Composite Attributes Simple: Each entity has a single atomic value for the attribute. For example, SSN or Sex. Composite: The attribute may be composed of several components. For example, Name (FirstName, MiddleName, LastName). 7/1/2016 Chen Qian Univ. of Kentucky 10 Types of Attributes (cont.) Single-valued vs. Multi-valued. Single-valued: an entity may have at most one value for the attribute Multi-valued: An entity may have multiple values for that attribute. For example, PreviousDegrees of a STUDENT. {PreviousDegrees}. NULL values What if the student does not hold a previous degree? What if the student has a previous degree but the information is not provided? Apartment number in an address 7/1/2016 Chen Qian Univ. of Kentucky 11 Types of Attributes (cont.) Stored vs. derived Number of credit hours a student took in a semester GPA of a student in a semester 7/1/2016 Chen Qian Univ. of Kentucky 12 Key Attributes Entities with the same basic attributes are grouped or typed into an entity type (or entity set). For example, the EMPLOYEE entity type or the PROJECT entity type. An attribute of an entity type for which each entity must have a unique value is called a key attribute (or key, in short) of the entity type. For example, SSN of EMPLOYEE. 7/1/2016 A key attribute may be composite. An entity type may have more than one key. Chen Qian Univ. of Kentucky 13 SUMMARY OF ER-DIAGRAM NOTATION Symbol Meaning ENTITY TYPE ATTRIBUTE KEY ATTRIBUTE MULTIVALUED ATTRIBUTE COMPOSITE ATTRIBUTE DERIVED ATTRIBUTE 7/1/2016 Chen Qian Univ. of Kentucky 14 Summary (cont.) 7/1/2016 Chen Qian Univ. of Kentucky 15 Summary (cont.) John Smith is an entity. EMPLOYEE is an entity type. SSN is a key attribute. The SSN of John, 123-45-6789, is a value of the attribute 7/1/2016 Chen Qian Univ. of Kentucky 16 Relationships A relationship relates two or more distinct entities with a specific meaning. For example, EMPLOYEE John Smith works on the iPhone8 PROJECT or EMPLOYEE Franklin Wong manages the Research DEPARTMENT. Relationships of the same type are grouped or typed into a relationship type (or relationship set). For example, the WORKS_ON relationship type in which EMPLOYEEs and PROJECTs participate, or the MANAGES relationship type in which EMPLOYEEs and DEPARTMENTs participate. 7/1/2016 Chen Qian Univ. of Kentucky 17 Relationships The degree of a relationship type is the number of participating entity types. Both MANAGES and WORKS_ON are binary relationships. Name a 3-degree relationship? John works on the iPhone8 project since 2010. 7/1/2016 Chen Qian Univ. of Kentucky 18 Instances of a relationship: a set of relationships EMPLOYEE WORKS_FOR e1 r1 e2 e3 e4 e5 r2 DEPARTMENT d1 d2 r3 r4 e6 r5 e7 r6 d3 r7 7/1/2016 Chen Qian Univ. of Kentucky 19 Summary John Smith work on the iPhone8 project is a relationship. WORKS_ON is an relationship type. The set of WORKS_ON relationships of all current employees and departments is an instance of WORKS_ON 7/1/2016 Chen Qian Univ. of Kentucky 20 Key constraints One-to-one (1:1) MANAGE One-to-many (1:N) or Many-to-one (N:1) DEPENDENT_OF Many-to-many WORKS_ON 7/1/2016 Chen Qian Univ. of Kentucky 21 Many-to-one (N:1) RELATIONSHIP EMPLOYEE WORKS_FOR e1 r1 e2 r2 e3 r3 DEPARTMENT d1 d2 e4 e5 r4 e6 r5 e7 r6 d3 r7 7/1/2016 Chen Qian Univ. of Kentucky 22 Many-to-many (M:N) RELATIONSHIP EMPLOYEE WORKS_ON PROJECT r9 e1 r1 e2 r2 e3 p1 p2 r3 e4 r4 e5 e6 r5 e7 r6 p3 r7 r8 7/1/2016 Chen Qian Univ. of Kentucky 23 More Examples Each student may have exactly one account. Each faculty may teach many courses Each student may enroll many courses Students Courses Students 7/1/2016 Own UK Accounts TaughtBy Enroll Instructors Courses Chen Qian Univ. of Kentucky 24 The (min,max) notation relationship constraints 7/1/2016 (0,1) (1,1) (1,1) (1,N) Chen Qian Univ. of Kentucky 25 Participation constraints total participation All entities of an entity type participate in the relationship type. Example: The participation of EMPLOYEE in WORKS_IN. partial participation A participation that is not total Example: The participation of EMPLOYEE in MANAGE. 7/1/2016 Chen Qian Univ. of Kentucky 26 The (min,max) notation relationship constraints 7/1/2016 (0,1) (1,1) (1,1) (1,N) Chen Qian Univ. of Kentucky 27 Recursive relationship We can also have a recursive relationship type. Both participations are same entity type in different roles. For example, SUPERVISION relationships between EMPLOYEE (in role of supervisor or boss) and (another) EMPLOYEE (in role of subordinate or worker). In ER diagram, need to display role names to distinguish participations. 7/1/2016 Chen Qian Univ. of Kentucky 28 A RECURSIVE RELATIONSHIP SUPERVISION EMPLOYEE SUPERVISION e1 2 1 e2 1 e3 e4 e7 2 r2 2 1 r3 2 e5 e6 r1 1 1 2 r4 1 r5 2 r6 7/1/2016 Chen Qian Univ. of Kentucky 29 Weak Entity Types A week entity is an entity that does not have a key attribute A weak entity must participate in an identifying relationship type with an owner Entities are identified by the combination of: A partial key of the weak entity type The particular owner they are related to in the owner type 7/1/2016 Chen Qian Univ. of Kentucky 30 7/1/2016 Luke Huan Univ. of Kansas 31 Weak Entity Types Example: Suppose that a DEPENDENT entity is identified by the dependent’s first name and birthdate, and the specific EMPLOYEE that the dependent is related to. DEPENDENT is a weak entity type with EMPLOYEE as its owner via the identifying relationship type DEPENDENT_OF An dependent of John Smith, Mary, is identified by her first name Mary, birthdate 1/1/1980, and the entity John. 7/1/2016 Chen Qian Univ. of Kentucky 32 SUMMARY OF ER-DIAGRAM NOTATION FOR ER SCHEMAS Symbol Meaning ENTITY TYPE WEAK ENTITY TYPE RELATIONSHIP TYPE IDENTIFYING RELATIONSHIP TYPE ATTRIBUTE KEY ATTRIBUTE MULTIVALUED ATTRIBUTE COMPOSITE ATTRIBUTE DERIVED ATTRIBUTE E1 E1 R R 7/1/2016 E2 R N (min,max) TOTAL PARTICIPATION OF E2 IN R E2 CARDINALITY RATIO 1:N FOR E1:E2 IN R E Key CONSTRAINT (min, max) ON PARTICIPATION OF E IN R Chen Qian Univ. of Kentucky 33 More ER Attribute of a relationship type: To describe some properties of the relationship. since name ssn Employees 7/1/2016 dname lot did Works_In Chen Qian Univ. of Kentucky budget Departments 34 More ER An arrow indicates that given an entity, we can uniquely determine the other entity of the relationship. dependent (1,1) IS_DEPEN (1,1) department (0,N) (0,1) MANAGE employee employee 35 ER-Design Principles Avoid redundancy. Limit the use of weak entity sets. Don’t use an entity set when an attribute will do. Limit the use of multivalued or composite attributes. 36 Avoiding Redundancy Redundancy occurs when we say the same thing in two different ways. Redundancy wastes space (more importantly) encourages inconsistency. The two instances of the same fact may become inconsistent if we change one and forget to change the other, related version. 37 Example of inconsistency Suppose in a DB, a student is stored as an entity with an attribute advisor, a professor is also stored as an entity with a multi-valued attribute advisees. So…. If a student changes her advisor, and we only make changes of the student entity. Inconsistency happens because the professor entity still includes her as an advisee. The right way is: 7/1/2016 Chen Qian Univ. of Kentucky 38 Consistent ER name student name advised professor 39 More Example Assume we would like to set up a database for a wine distribution company. Wine Manufacture 40 Example name Beers name ManfBy addr Manfs manf 41 Example: redundancy name Beers name ManfBy addr Manfs manf This design states the manufacturer of a beer twice: as an attribute and as a related entity. 42 Example name manf manfAddr Beers 43 Example: redundancy and losing information name manf manfAddr Beers This design repeats the manufacturer’s address once for each beer; loses the address if there are temporarily no beers for a manufacturer. 44 Example name Beers name ManfBy addr Manfs 45 Example name Beers name ManfBy addr Manfs This design gives the address of each manufacturer exactly once. 46 Don’t Overuse Weak Entity Sets Beginning database designers often doubt that anything could be a key by itself. They make all entity sets weak, supported by all other entity sets to which they are linked. In reality, we usually create unique ID’s for entity sets. Examples include social-security numbers, automobile VIN’s etc. 47 When Do We Need Weak Entity Sets? The usual reason is that there is no global authority capable of creating unique ID’s. Example it is unlikely that there could be an agreement to assign unique player numbers across all football teams in the world. 48 Limit the use of multivalued or composite attributes Because they can be converted to entities, which are easier to manage name addr Manfs beer 7/1/2016 Chen Qian Univ. of Kentucky 49 Limit the use of multivalued or composite attributes Because they can be converted to entities, which are easier to manage name Beers 7/1/2016 name ManfBy Chen Qian Univ. of Kentucky addr Manfs 50 Decision making 7/1/2016 Entity vs. Attribute Chen Qian Univ. of Kentucky 51 Entity Sets Versus Attributes An entity set should satisfy at least one of the following conditions: It is more than the name of something; it has at least one non-key attribute. or It is the “many” in a many-one or many-many relationship. 52 Example: Bad name Beers name ManfBy Manfs 53 Example: Bad name Beers name ManfBy Manfs Since the manufacturer is nothing but a name, and is not at the “many” end of any relationship, it should not be an entity set. 54 Example: Good name manf Beers There is no need to make the manufacturer an entity set, because we record nothing about manufacturers besides their name. 55 Example: Good name Beers name ManfBy addr Manfs •Manfs deserves to be an entity set because of the non-key attribute addr. •Beers deserves to be an entity set because it is the “many” of the many-one relationship ManfBy. 56 ER Case Study I Works_In does not allow an employee to work in a department for two or more periods. We want to record several values of the descriptive attributes for each instance of this relationship. 7/1/2016 from name ssn to dname lot did Departments Works_In Employees budget name dname ssn lot Employees from did Works_In Duration budget Departments to 57 57 ER Case Study I We want to record several values of the descriptive attributes for each instance of this relationship. name dname ssn lot Employees from did Works_In budget Departments Duration to Why not name ssn lot Employees from 7/1/2016 dname did Works_In budget Departments to 58 ER Case study II Design a database representing cities, counties, and states For states, record name and capital (city) For counties, record name, area, and location (state) For cities, record name, population, and location (county and state) Assume the following: 7/1/2016 Names of states are unique Names of counties are only unique within a state Names of cities are only unique within a county A city is always located in a single county A county is always located in a single state 59 Case study : first design County area is repeated for every city in the county State capital should really be a city Redundancy is bad. What else? Should “reference” entities through explicit relationships I didn’t say a city name is unique within a state! name name Cities population In States capital county_name county_area 7/1/2016 60 Case study : second design Technically, nothing in this design could prevent a city in state X from being the capital of another state Y, but oh well… name Cities population In IsCapitalOf name Counties In States name area 7/1/2016 61 Summary of Conceptual Design Conceptual design follows requirements analysis, ER model popular for conceptual design Yields a high-level description of data to be stored Constructs are expressive, close to the way people think about their applications. Basic constructs: entities, relationships, and attributes (of entities and relationships). Note: There are many variations on ER model. Summary of ER (Contd.) ER design is subjective. There are often many ways to model a given scenario! Analyzing alternatives can be tricky, especially for a large enterprise. Common choices include: Entity vs. attribute, entity vs. relationship, binary or n-ary relationship. Homework Quiz (close book) Beginning of next class 10 mins Project 7/1/2016 Find your collaborator 64