Theory, Practice & Methodology of Relational Database Design and Programming Copyright © Ellis Cohen 2002-2008 Introduction to Conceptual Database Design These slides are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License. For more information on how you may use them, please see http://www.openlineconsult.com/db 1 Overview of Lecture Entity Classes Relationships & ER Diagrams 1:M Relationship Design Multiple Relationships & The Fan Traps Conceptual Design Other ER Models Mandatory Participation Reflexive 1:M Relationships Class Identification & Surrogate Keys Redundancy & Anomalies Simple Functional Dependencies Simple Conceptual Normalization © Ellis Cohen 2001-2008 2 Entity Classes © Ellis Cohen 2001-2008 3 Conceptual Modeling Conceptual Modeling is a way of designing systems involving collections of tables by focusing on • Entities – an abstraction of tuples • Entity Classes – an abstraction of tables • Relationships – between entities in different entity classes Using diagrams called ER diagrams (or Entity-Relationship Diagrams) © Ellis Cohen 2001-2008 4 Tables as Themes Employees empno ename sal comm 7499 ALLEN 1600 300 7654 MARTIN 1250 1400 7698 BLAKE 2850 7839 KING 5000 7844 TURNER 1500 7986 STERN 1500 0 A row represents a single Employee Every table has a theme -- e.g. Employees Every row represents an instance of that theme -- e.g. a single Employee © Ellis Cohen 2001-2008 5 Columns as Attributes Employees Primary Key is underlined Uniquely identifies an employee empno ename sal comm 7499 ALLEN 1600 300 7654 MARTIN 1250 1400 7698 BLAKE 2850 7839 KING 5000 7844 TURNER 1500 7986 STERN 1500 0 Every column represents an attribute related to the theme -- e.g. the name or salary of an Employee © Ellis Cohen 2001-2008 6 Rows as Objects/Entities Employees empno ename sal comm 7499 ALLEN 1600 300 7654 MARTIN 1250 1400 7698 BLAKE 2850 7839 KING 5000 7844 TURNER 1500 7986 STERN 1500 an Employee Object empno: 7654 ename: MARTIN sal: 1250 comm: 1400 0 It can be useful to think of each row as an object or entity and the table as a collection of these entities. The columns of the table correspond to the instance variables for each object © Ellis Cohen 2001-2008 7 Entity Classes In conceptual modeling, we focus on the entity class which represents the class of entities with the same theme. In general (but as we will see, not always), an entity class is implemented by a table in a relational database © Ellis Cohen 2001-2008 8 Modeling Entity Classes Visual Conceptual Model (Crow Magnum) Sometimes we don't Sometimes we include all the attributes Employee Employee empno ename sal comm Sometimes we just include the primary key Employee empno Textual Conceptual Model (Brief ConText) Employee( empno, ename, sal, comm ) © Ellis Cohen 2001-2008 9 Attributes Types Keep attribute types simple Complex attribute types often mean you need to rethink your design or be more specific Employee empno ename sal comm a number a string a dollar amount a dollar amount © Ellis Cohen 2001-2008 10 Relationships and ER Diagrams © Ellis Cohen 2001-2008 11 ER (Entity-Relationship) Diagrams (Crow Magnum style) Depicts a relationship between Employees and Depts Employee empno ename sal comm relationship characterization works for Dept deptno dname Crows Foot The Crow's foot at Employee means … • A Dept can have MANY Employees No Crow's foot at Dept, so … • An Employee works for no more than ONE Dept © Ellis Cohen 2001-2008 12 ER & Instance Diagrams ER Diagram works for Employee Dept Relationship Entity Class Entity Class * Corresponds to links between instances of the related classes Instance Diagram Shows example instances and the links between them 7499 ALLEN 1600 300 7654 MARTIN 1250 1400 7844 TURNER 1500 0 7698 BLAKE 2850 7986 STERN 1500 Entity Instances Links © Ellis Cohen 2001-2008 10 SALES 30 ACCOUNTING Entity Instances 13 Instance Diagrams & Navigation Links 7499 ALLEN 1600 300 7654 MARTIN 1250 1400 7844 TURNER 1500 0 7698 BLAKE 2850 7986 STERN 1500 10 SALES 30 ACCOUNTING Links are the basis of navigation between instances. We will see later that there are SQL queries which effectively find the tuples in one entity class which are related to the tuples in another entity class. So we can write SQL to find information about • The dept associated with an employee • All the employees who work for a department © Ellis Cohen 2001-2008 14 Entities & Links as Persistent Data Every – entity instance of an entity class – link instance of a relationship Represents – data persistently stored in a database – information needed to answer a query (e.g. what’s the salary of ALLEN?, what department does ALLEN work in?) The ONLY reason to represent an entity class or a relationship in a conceptual model is because the requirements clearly indicate they are necessary to provide information needed for some query © Ellis Cohen 2001-2008 15 1:M (1 to Many) Relationships Child Entity Class Employee Parent Entity Class works for Dept deptno empno an Employee works for (at most) 1 Dept a Dept has any number (i.e. M) of Employees © Ellis Cohen 2001-2008 16 Easy Crow Magnum Relationships Visual Conceptual Model (Easy Crow Magnum) Employee Employee empno works for works for Dept Dept deptno no attributes shown in this example just primary keys shown in this example Easy Crow Magnum is meant for quickly drawing designs on paper or a whiteboard In Easy Crow Magnum, don't draw the box outlines on entity classes © Ellis Cohen 2001-2008 17 Visual & Textual Models for Relationships VISUAL Conceptual Model (Crow Magnum) Employee works for Dept relationship characterization TEXTUAL Conceptual Model (Brief ConText) (*) Employee works for Dept relationship characterization © Ellis Cohen 2001-2008 18 Inverse Relationships VISUAL Conceptual Model (Crow Magnum) Dept contains Employee relationship characterization TEXTUAL Conceptual Model (Brief ConText) Dept contains (*) Employee relationship characterization © Ellis Cohen 2001-2008 19 Indicating Relationship Direction Employee Employee Employee works for works for contains Dept Dept Dept Dept Dept Dept contains contains works for Employee Employee Employee Must indicate the relationship direction if it is not the natural reading direction (L-to-R, top-to-bottom) © Ellis Cohen 2001-2008 20 Naming Relationships It is useful, though not required, to name relationships in ConText Entity Classes Employee( empno, ename, addr ) Dept( deptno, dname ) Relationship name Relationships Relationship characterization DeptAssignment: (*) Employee works for Dept Choose a relationship name so that it clearly and uniquely identifies the relationship © Ellis Cohen 2001-2008 21 M:N (Many-to-Many) Relationships Visual Conceptual Model (Crow Magnum) Employee assigned to Project Each employee may be assigned to a number of projects Each project may have a number of employees Textual Conceptual Model (Brief ConText) (*) Employee assigned to (*) Project © Ellis Cohen 2001-2008 22 M:N Related Instances assigned to Employee Project empno empno ename pno address 7499 ALLEN ... 7654 MARTIN ... 7698 BLAKE ... 7839 KING ... 7844 TURNER ... 7986 STERN ... pno pname … 2618 … 2621 … 2622 … © Ellis Cohen 2001-2008 23 1:1 (One-to-One) Relationships Visual Conceptual Model (Crow Magnum) Desk assigned to Employee Each employee may be assigned to at most one desk Each desk may be assigned to at most one employee Textual Conceptual Model (Brief ConText) Desk assigned to Employee © Ellis Cohen 2001-2008 24 1:1 Related Instances assigned to Employee Desk empno empno ename deskno address 7499 ALLEN ... 7654 MARTIN ... 7698 BLAKE ... 7839 KING ... 7844 TURNER ... 7986 STERN ... deskno … © Ellis Cohen 2001-2008 311 … 312 … 313 … 25 1:M Relationship Design © Ellis Cohen 2001-2008 26 1:M Relationship Exercise Come up with an Easy Crow Magnum ER Diagram of a 1:M Relationship between two entity classes (not a 1:1 or M:N relationship, and not Dept/Employee) Employee works for empno ename sal Dept deptno dname The diagram must 1. 2. 3. 4. show the name of each entity class show main attributes of each entity class include the primary key (and underline it) show the relationship characterization © Ellis Cohen 2001-2008 27 Choosing Relationship Characterizations Employee Employee has works for Dept Dept Dept Dept has employs Employee Employee ER diagrams are meant for communicating designs as clearly as possible. It is worth taking the time to choose the best possible relationship characterization. © Ellis Cohen 2001-2008 28 Mistaking 1:M for M:N Person likes Ice Cream Flavor persid flavid It is common to consider relationships from only one perspective. Think about a person. They like many ice cream flavors. Must be 1:M? Person likes Ice Cream Flavor persid flavid But in fact, many persons like the same ice cream flavor, so it is actually M:N. Specifying the relationship characterization is crucial. The "favorite flavor" relationship is 1:M, but in the other direction! Person favorite flavor Ice Cream Flavor persid flavid © Ellis Cohen 2001-2008 29 Choosing the Wrong Key Person owns Book persid title This case is somewhat similar to the previous one. A person owns many books, each one identified by their title. So, it must be 1:M. In fact, may people own a book with a particular title (e.g. there are lots of copies of "Gone with the Wind"). The problem is that the wrong primary key has been used. Imagine that every copy of every book ever published was given a unique serial number which uniquely identifies a single book instance. That's what would be needed to really have a 1:M relationship! Person owns persid Book serialnum © Ellis Cohen 2001-2008 30 Singleton Classes DON'T DO THIS Boston University owns Building bldgno DO THIS INSTEAD Building bldgno OR THIS University owns name Building bldgno Boston University is a singleton entity class, which only has a single entity (i.e. a single tuple) in it. Either leave it out entirely, or replace it with a more general entity class © Ellis Cohen 2001-2008 31 Singleton Classes & M:N Relationships DON'T DO THIS Microsoft uses Building bldgno DO THIS INSTEAD Building bldgno OR THIS Company uses Building name bldgno Generalizing singleton entity classes can result in M:N relationships © Ellis Cohen 2001-2008 32 Entity Classes vs. Attributes DON'T DO THIS Employee makes empno ename DO THIS INSTEAD Salary salary Employee empno ename salary Don't create an entity class for something that really does not need to have a "life of its own", but ought to simply be an attribute of another entity class © Ellis Cohen 2001-2008 33 Attribute vs. Entity Class Principles Employee empno ename deptno Employee empno ename Dept deptno Reasons for upgrading attributes to entity classes: 1. Substance: It emphasizes that a department is something that can and should stand in its own right 2. Extensibility: One might want to add attributes specific to a department (e.g. its name, location, etc.) 3. Multiplicity: It emphasizes that a department can have multiple employees associated with it 4. Association: It emphasizes that an employees cannot have an arbitrary deptno value, but that the employee is associated with a department which has a specific deptno Note that none of these reasons make sense for Salary (though multiplicity suggests that it might be useful to add a PayGrade Class, and make salary an attribute of a PayGrade) © Ellis Cohen 2001-2008 34 Break Up Complex Attributes Employee empno ename assignments Employee Do NOT use attributes that contain complex sets of details. Break them up into additional entity classes! Assignment empno ename Moreover, this should probably be replaced by a relationship with a Project entity class © Ellis Cohen 2001-2008 assnid projnam rate startdate 35 Entity Attributes & Relationships works for Employee Does NOT include deptno empno ename addr deptno * Dept deptno dname Employee does not contain an attribute that identifies the Dept it is associated with (i.e. deptno). An employee is certainly associated with a department – but that is represented by the relationship between Employee and Dept. A deptno attribute in Employee would be an entity attribute (it identifies a Dept entity). This would not only be redundant (with the works for relationship), but wrong to include in a Conceptual Model [no conceptual foreign keys] © Ellis Cohen 2001-2008 36 Relationships & Persistence • A relationship represents information which needs to be persistently stored in a database! • If the information doesn't need to be stored and queried, don't represent it as a relationship * • Don't include relationships which simply show what a user can do or keep track of what a user has done, unless it is clear that information will be needed later! © Ellis Cohen 2001-2008 37 Multiple Relationships & The Fan Traps © Ellis Cohen 2001-2008 38 Multiple Relationships Team Team Coach has Player has Player works for Team has enrolled in has Child Health Plan Player What do these diagrams mean? © Ellis Cohen 2001-2008 39 The Fan Trap Suppose there are multiple divisions in a company, each divided into departments Every employee works for a division (and is assigned to a particular department in that division) What's wrong with the diagram below? Employee works for divided into Division © Ellis Cohen 2001-2008 Dept 40 Fan Trap Instances 7499 ALLEN 7654 MARTIN 7844 TURNER 7698 BLAKE 7986 … 10 SALES 30 ACCOUNTING DIV A STERN DIV B … It is impossible to determine which department an employee is assigned to! © Ellis Cohen 2001-2008 41 The Reverse Fan Trap Suppose a company has multiple divisions Every employee is employed by a division, and assigned to a particular department (in that division) What's wrong with this diagram? works for employs Division divno Employee empno © Ellis Cohen 2001-2008 Dept deptno 42 Reverse Fan Trap Instances DIV A DIV B 7499 ALLEN 7654 MARTIN 7844 TURNER 7698 BLAKE 7986 STERN … 10 SALES 30 ACCOUNTING … Two employees in the same department could be assigned to different divisions Is there any way to prevent this when using this model? © Ellis Cohen 2001-2008 43 Business Rules & The Reverse Fan Trap works for employs Division divno Employee Dept empno deptno + Two employees who are in the same department must be in the same division We can prevent this problem by adding the business rule above! But how would this be enforced? Well, we'll see that we can write code that detects every time a change is made to the data in the database which might violate this business rule, and then ensures that the business rule is enforced! Are there any other problems with this model? © Ellis Cohen 2001-2008 44 Deletion Anomalies & The Reverse Fan Trap DIV A DIV B 7499 ALLEN 7654 MARTIN 7844 TURNER 7698 BLAKE 7986 STERN … 10 SALES 30 ACCOUNTING … Suppose STERN is the only employee in department 30. If STERN is terminated, there is no longer any way, to determine that dept 30 is in division B! So maybe we should try another model? © Ellis Cohen 2001-2008 45 Resolving the Fan Traps Employee 7499 ALLEN 7654 MARTIN 7844 TURNER 7698 BLAKE 7986 STERN works for Dept 10 part of Division SALES DIV A 30 ACCOUNTING … DIV B … It is now possible to determine each employee's department & each department's division! We can also still tell which division an employee is assigned to, by following the link from the employee to the dept, and then from the dept to the division © Ellis Cohen 2001-2008 46 Conceptual Design © Ellis Cohen 2001-2008 47 Database Design Levels Conceptual Design / Model Model of the database design in terms that users will understand Logical Design / Model Description of the design in terms that can be directly used to build a database (This is called a Relational Model, if we are building an RDB) Physical Design / Model Additional design descriptions that specify or affect the data representation in physical storage © Ellis Cohen 2001-2008 48 Database Design Process Requirements Conceptual Design Conceptual Model A model used for communication with system analysts and UI designers Relational Mapping Actual Design of Tables in the Database Relational Model Physical Mapping Physical Model using DDL & DCL © Ellis Cohen 2001-2008 49 Conceptual Modeling Conceptual Modeling (also known as Conceptual Design) starts with two activities • Identify Entity Classes • Identify Relationships between them © Ellis Cohen 2001-2008 50 Identify Entity Classes Identify the classes of entities called for to support requirements Think of an entity as – a readily identifiable thing A potential entity class should have 1. Multiple instances (i.e. the requirements should imply the need for more than one entity of that class) 2. A clear set of attributes based on the requirements 3. A primary key to uniquely identify tuples © Ellis Cohen 2001-2008 51 Design Exercise A university is divided into departments. Each department is made up of faculty members. Each department may have a number of degree-granting programs. Every student may only declare a major in a single program. Every student may be assigned a faculty member as an advisor. What are the entity classes needed? © Ellis Cohen 2001-2008 52 Entity Classes Needed University: MAYBE, but it sounds like it's a singleton class Department: YES Faculty Member: YES Student: YES Person: MAYBE (ignore for now) (Degree-Granting) Program: YES Major: MAYBE (but it is simpler to just treat major as a relationship) between a student and a program Advisor: MAYBE (but it is simpler to treat advise as a relationship between Faculty Member and Student (also, Advisor is a subclass of Faculty Member, which is a bit too complicated to get into just yet) © Ellis Cohen 2001-2008 53 Identify Relationships If the database needs to persistently keep track of a link/association between an entity of one class and an entity of another class The entity classes are related to one another Draw the Easy Crow Magnum diagram based on the identified entity classes. All 1:M Relationships! © Ellis Cohen 2001-2008 54 Design Solution offers Department made up of Faculty Member Program majors in advises Student Department Faculty Member Faculty members sometimes have appointments in more than one department. The relationship would be 1:M if the requirement were " Each department is made up of faculty members; and each faculty member is only in one department, at most" © Ellis Cohen 2001-2008 55 Relationship Exercise Bikeshop has in stock Bike sells instance of Bike Model part of Part used on instance of Part Model The model above describes the parts on bikes sold by a chain of bike-shops (i.e. there are multiple bike-shops). It contains both 1:M and M:N relationships, but the Crow's Feet have all been removed. Put them back where they belong! © Ellis Cohen 2001-2008 56 Relationship Exercise Answer Bikeshop has in stock Bike sells instance of used on part of Part Bike Model instance of © Ellis Cohen 2001-2008 Part Model 57 Uses of 1:M Relationships 1. Relationship between independent entity classes Employee enrolled in Dental Plan 2. Aggregation (Part/Whole) Relationship Part part of Bike 3. Instance/Category Relationship Part instance of © Ellis Cohen 2001-2008 Part Model 58 Entity Class Specificity Whenever we design entity classes, it is important to think VERY CAREFULLY about its specificity When designing the entity class Auto, we could mean – A kind of auto: Individual entities would be identified by model and year: e.g. 1984 Honda Accord. Better to name this class Auto Model – A specific auto: Individual entities would be identified by their VIN (vehicle identification number): e.g. 614HT37994PL7394, but might also have a model and year (or maybe not) Sometimes a design will need multiple levels of specificity (related by a 1:M relationship – e.g. Bike & Bike Model). © Ellis Cohen 2001-2008 59 Other ER Models © Ellis Cohen 2001-2008 60 Chen 1:M ER Model Child Entity Class Crow Magnum Employee empno ename addr Chen Employee Parent Entity Class works for works for Dept deptno dname Dept Key constraint: The primary key of Employee not only uniquely identifies an employee (and their name & addr), but also uniquely identifies their associated department (its deptno & dname) © Ellis Cohen 2001-2008 61 UML 1:M ER Model Child Entity Class Crow Magnum Parent Entity Class works for Employee Dept Associations (i.e. relationships) reflect the associations that one entity should or needs to have with other entities. A department needs to be associated with multiple (*) employees – all the employees that work for it UML Employee * works for 0..1 Dept An employee only needs to be associated with, at most, a single department – the department it is in © Ellis Cohen 2001-2008 62 UML with Attributes Crow Magnum works for Employee Dept empno ename addr UML Employee PK * deptno dname works for empno ename addr 0..1 Dept PK deptno dname Using UML's icon notation for PK © Ellis Cohen 2001-2008 63 Primary Key Representations in UML Dept UML «PK» deptno dname Use UML's icon notation. An icon is a predefined arbitrary graphical PK symbol, in this case, to be used in place of a stereotype Dept UML PK deptno dname Dept UML Use UML's stereotype notation, which is a way of associating domain-specific characteristics, enclosed in «guillemets», with a UML element deptno {PK} dname Use UML's property notation, which is a way of associating properties, enclosed in {curly braces}, with a UML element © Ellis Cohen 2001-2008 64 UML Aggregation Crow Magnum part of Part Bike When a 1:M relationship is a part/whole relationship (e.g. a bicycle part is a part of a bike), then UML uses a special aggregation symbol to depict it UML Part * 0..1 Bike In rare circumstances, UML also allows aggregation with M:N relationships as well © Ellis Cohen 2001-2008 65 1:M Relationships Crow Magnum Easy Crow Magnum UML Chen works for Employee works for Employee Employee Employee Dept * works for works for © Ellis Cohen 2001-2008 Dept 0..1 Dept Dept 66 1:1 Relationships Crow Magnum Easy Crow Magnum UML Chen sits at Employee sits at Employee Employee Employee Desk 0..1 sits at sits at © Ellis Cohen 2001-2008 Desk 0..1 Desk Desk 67 M:N Relationships Crow Magnum Easy Crow Magnum UML Chen assigned to Employee Project assigned to Employee Employee * assigned to assigned to Employee Project * Project Project No key constraints! © Ellis Cohen 2001-2008 68 Mandatory Participation © Ellis Cohen 2001-2008 69 Child Participation Child Entity Class Employee Parent Entity Class works for Dept Mandatory: An Employee MUST work for 1 Dept Employee works for Dept Indeterminate Deferred participation design decision © Ellis Cohen 2001-2008 70 Mandatory Child Participation (Every employee is assigned to 1 department) Child Entity Class Employee Parent Entity Class works for 7499 ALLEN 1600 300 7654 MARTIN 1250 1400 7844 TURNER 1500 0 7698 BLAKE 2850 7986 STERN Dept 10 SALES 30 ACCOUNTING 1500 Every Employee participates in a relationship with a Dept © Ellis Cohen 2001-2008 71 Non-Mandatory Child Participation (There may be employees with no dept) Child Entity Class Employee Parent Entity Class works for 7499 ALLEN 1600 300 7654 MARTIN 1250 1400 7844 TURNER 1500 0 7698 BLAKE 2850 7986 STERN Dept 10 SALES 30 ACCOUNTING 1500 There are Employees who don't participate in a relationship with a Dept © Ellis Cohen 2001-2008 72 Mandatory Child Participation in UML Child Entity Class UML Employee Parent Entity Class * works for 1 Dept an Employee must be assigned to 1 Dept Crow Magnum Employee works for © Ellis Cohen 2001-2008 Dept 73 Parent Participation Child Entity Class Parent Entity Class works for Employee Dept Mandatory: MUST be 1 Employee in every Dept Employee works for Dept Indeterminate: Deferred participation design decision © Ellis Cohen 2001-2008 74 Mandatory Parent Participation (Every department has at least 1 employee) Child Entity Class Employee Parent Entity Class works for 7499 ALLEN 1600 300 7654 MARTIN 1250 1400 7844 TURNER 1500 0 7698 BLAKE 2850 7986 STERN Dept 10 SALES 30 ACCOUNTING 1500 Every Dept participates in a relationship with an Employee © Ellis Cohen 2001-2008 75 Non-Mandatory Parent Participation (There may be depts with no employees) Child Entity Class Parent Entity Class works for Employee 7499 ALLEN 1600 300 7654 MARTIN 1250 1400 7844 TURNER 1500 0 7698 BLAKE 2850 7986 STERN Dept 10 SALES 20 RESEARCH 30 ACCOUNTING 1500 There are Depts who don't participate in a relationship with Employees © Ellis Cohen 2001-2008 76 Mandatory Parent Participation in UML Child Entity Class UML Employee Parent Entity Class 1..* works for 0..1 Dept a Dept must have at least 1 employee Crow Magnum Employee works for © Ellis Cohen 2001-2008 Dept 77 Reflexive 1:M Relationships © Ellis Cohen 2001-2008 78 Reflexive Entity Attributes Suppose an employee can have a manager, (who is another employee) Employee DON'T DO THIS! empno ename addr mgr This could identify an employee's manager. Perhaps it would hold the empno or ename of the manager. IS THIS OK OR A BAD IDEA? © Ellis Cohen 2001-2008 79 Entity Attributes & Relationships Employee DON'T DO THIS! works for empno ename addr deptno Dept deptno dname What's wrong with adding deptno as an attribute of Employee? © Ellis Cohen 2001-2008 80 Attributes vs Relationships Employee DON'T DO THIS! empno ename addr mgr mgr would be an entity attribute – an attribute whose value identifies some other entity – in this case, some other employee But entity attributes reflect relationships – e.g. an employee is related (by the manages relationship) to the employee who is their manager. The conceptual model is meant to show relationships. Replace entity attributes by relationships! © Ellis Cohen 2001-2008 81 Reflexive Relationships Visual Conceptual Model (Crow Magnum) An employee may manage other employees Employee empno ename addr manages Textual Conceptual Model (Brief ConText) Employee( empno, ename, addr ) This does not imply that a manager can manage themselves (which would probably be a bad idea!) It doesn’t disallow it either. We'll see how to do that later. Employee manages (*) Employee © Ellis Cohen 2001-2008 82 Reflexive Relationships & Instance Hierarchies 7839 7698 BLAKE KING 5000 2850 7566 7654 7844 TURNER 1500 MARTIN 0 1250 7499 JONES 2975 1400 ALLEN 1600 300 Reflexive relationships commonly describe entity hierarchies: KING manages BLAKE & JONES BLAKE manages MARTIN & TURNER JONES manages ALLEN, etc. © Ellis Cohen 2001-2008 83 Complete Conceptual Model with Attributes Employee works for empno ename addr Dept deptno dname manages Sometimes, designers draw a more detailed diagram that includes the attributes of an entity class (sometimes JUST the primary key attributes) © Ellis Cohen 2001-2008 84 Visual/Textual Conceptual Model Visual Conceptual Model (Crow Magnum) Employee empno manages works for Dept Note: here we chose to just show the primary keys deptno Textual Conceptual Model (Brief ConText) Entity Classes Employee( empno, ename, addr ) Dept( deptno, dname ) Note: MUST list all conceptual attributes here Relationships WorksFor: (*) Employee works for Dept Manages: Employee manages (*) Employee © Ellis Cohen 2001-2008 85 Class Identification & Surrogate Keys © Ellis Cohen 2001-2008 86 Entity Class Identification Exercise A clothing manufacturer identifies each style of item they make by their own unique stylecode. Items within the same style vary by size and color, and each such item is given an itemsku. Each style of item has a category identified by its catid. Create a conceptual model (UML or Easy Crow Magnum ER Diagram) for the manufacturer's database, which includes the following attributes itemsku stylecode stylenam styledate catid catnam size color e.g. e.g. e.g. e.g. e.g. e.g. e.g. e.g. 'FX311B-24M' '302' 'Hunting Bikini Brief' 1992 (when introduced) 'MU' 'Mens Underwear' 9 'red' © Ellis Cohen 2001-2008 87 Answer to Entity Class Identification Exercise VISUAL Conceptual Model (Easy Crow Magnum) Item itemsku size color has Style has Category stylecode stylenam styledate catid catnam TEXTUAL Conceptual Model (Brief ConText Entity Classes) Item( itemsku, size, color ) Style( stylecode, stylenam, styledate ) Category( catid, catnam ) ItemStyle: (*) Item has Style StyleCategory: (*) Style has Category © Ellis Cohen 2001-2008 88 Surrogate Primary Key Item itemsku size color Item vs itemid itemsku size color Surrogate primary key: A new attribute added to an entity class or relation and used in place of the original primary key. Why would you add a surrogate key? Note: We often wait to add surrogate keys until we map the conceptual model to a relational model © Ellis Cohen 2001-2008 89 Simple Candidate Keys A simple candidate key is any attribute which uniquely identifies a tuple Item itemid itemsku size color Simple Candidate Keys Both itemid and itemsku uniquely identify an item A designer chooses a primary key from one of the candidate keys © Ellis Cohen 2001-2008 90 Redundancy & Anomalies © Ellis Cohen 2001-2008 91 Normalization Problem In designing a conceptual model for employees with the attributes empno, ename, deptno, dname, addr What's wrong with just using a single entity class: Employee( empno, ename, deptno, dname, addr )? © Ellis Cohen 2001-2008 92 Answer: What's Wrong … Entity Class Principles: substance, extensibility, multiplicity, association argue that there should be a Dept class Redundancy: deptno & dname Extra Work: If changed name of a department, would have to do it in multiple places Anomalies: Could change deptno without changing dname or vice versa. © Ellis Cohen 2001-2008 93 Redundancy: deptno dname Employee empno deptno dname 30 SALES 7698 30 SALES 7839 10 ACCOUNTING 7844 30 SALES 7986 50 SUPPORT 7654 … … • Entities with the same value for deptno have the same value for dname • Including dname in the entity class is redundant, since it can be derived from deptno Redundancy causes duplicate work Suppose the company wants to change deptno 30 to be the Sales & Marketing department. That change must be made to multiple employees © Ellis Cohen 2001-2008 94 Redundancy and Anomaly Redundancy can cause anomalies (inconsistencies) if modifications are not done carefully • Update Anomaly: – Updating a value in a single cell can make the database inconsistent • Insertion Anomaly: – Adding an entity can make the database inconsistent • Deletion Anomaly: – Deleting some information can make the database inconsistent or cause unintended loss of information © Ellis Cohen 2001-2008 95 Anomaly Examples Employee empno deptno dname 30 SALES 7698 30 SALES 7839 10 ACCOUNTING 7844 30 SALES 7986 50 SUPPORT 7654 … … Modification Anomaly: Modify 7654's dname to 'SUPPORT' (without changing its deptno) Insert Anomaly: Insert a new employee with a deptno of 20, and a dname of 'SUPPORT' Delete Anomaly: Delete employee 7986 (it’s the only employee in SUPPORT, and no other entity class keeps track that dept 50 is SUPPORT) © Ellis Cohen 2001-2008 96 Simple Functional Dependencies © Ellis Cohen 2001-2008 97 Redundancy and Functional Dependencies Functional Dependencies • Specify which attributes in a entity class are determined by other attributes • Identify potential redundancies • Help us see how to eliminate those redundancies (generating the conceptual model we really should have produced initially!) © Ellis Cohen 2001-2008 98 Functional Dependencies (FD's) Dependencies among attributes AB A functionally determines B B functionally depends on A The value of A uniquely determines a single value for B If two or more entities (of a specific entity class) have the same value for A, they have the same value for B (e.g. Every employee that has the same value for deptno – e.g. 30 has the same value for dname – e.g. SALES) © Ellis Cohen 2001-2008 99 FD's for a Normalized Example Employee( empno, ename, addr ) empno ename empno addr empno can be used to lookup (and therefore uniquely determine) all the other attributes of an Employee tuple This can also be written as empno ename, addr or empno { ename, addr } Also empno empno (this is a trivial FD, which we usually don't write) © Ellis Cohen 2001-2008 100 Determinants & Dependents empno Determinant addr Dependent © Ellis Cohen 2001-2008 101 FD's for an Example with Redundancy Employee( empno, ename, deptno, dname, addr ) empno empno empno empno ename deptno dname addr However, also This is a problem! deptno is NOT a candidate key It indicates redundancy! deptno dname (possibly) dname deptno © Ellis Cohen 2001-2008 102 Redundancy: deptno dname Employee empno deptno dname 30 SALES 7698 30 SALES 7839 10 ACCOUNTING 7844 30 SALES 7986 50 SUPPORT 7654 … … Because deptno is not a candidate key, the same deptno value (e.g. 30) can appear multiple times. But deptno dname, that is, two tuples with the same value of deptno have the same value of dname Voila! REDUNDANCY! © Ellis Cohen 2001-2008 103 Simple FD Exercise Assume you have (foolishly) designed a single Item entity class containing all of the following attributes itemsku stylecode stylenam styledate catid catnam size color e.g. e.g. e.g. e.g. e.g. e.g. e.g. e.g. 'FX311B-24M' '302' 'Hunting Bikini Brief' 1992 (when introduced) 'MU' 'Mens Underwear' 9 'red' Find all the simple FD's whose determinant is not a candidate key © Ellis Cohen 2001-2008 104 Non-Candidate-Key FD's catid catnam stylecode stylenam stylecode styledate stylecode catid stylecode catnam • follows by transitive inference, because stylecode catid & catid catnam © Ellis Cohen 2001-2008 105 Simple Conceptual Normalization © Ellis Cohen 2001-2008 106 Simple Conceptual Normalization Employee( empno, ename, deptno, dname, addr ) deptno dname Given an entity class with a (non-trivial) functional dependency whose determinant is NOT a candidate key – Split out a new entity class – Make the determinant the primary key (or at least a candidate key) of the new class – Move all attributes that depend on it Employee( empno, ename, addr ) Dept( deptno, dname ) Note: Most books only discuss Normalization at the Relational Design level. However, Conceptual Normalization, though not complete, is a way to improve a conceptual design. We'll examine Normalization in much more detail later in the term © Ellis Cohen 2001-2008 107 Result of Simple Conceptual Normalization Each simple conceptual normalization step – adds one entity class – adds one 1:M relationship link Employee empno ename deptno dname addr Employee has empno ename addr Dept deptno dname deptno dname © Ellis Cohen 2001-2008 108 Conceptual Normalization Exercise Assume you have designed an Item entity class with the following attributes itemsku stylecode stylenam styledate catid catnam size color e.g. e.g. e.g. e.g. e.g. e.g. e.g. e.g. 'FX311B-24M' '302' 'Hunting Bikini Brief' 1992 (when introduced) 'MU' 'Mens Underwear' 9 'red' Find an FD with a non-candidate-key determinant, and use Conceptual Normalization to split out a new entity class. Continue doing this with all of the resulting entity classes until each of them are in Conceptual Normal Form. © Ellis Cohen 2001-2008 109 ER Decomposition (a) Item itemsku Step stylecode 1 stylenam styledate stylecode catid stylenam, catnam styledate, size catid color catnam has Item itemsku size color has Item Style itemsku size color Style stylecode stylenam styledate catid catnam Step 2 catid catnam has stylecode stylenam styledate Category catid catnam Each simple conceptual normalization step • adds one entity class • adds one relationship link © Ellis Cohen 2001-2008 110 ER Decomposition (b) Item itemsku stylecode stylenam styledate catid catnam size color Item itemsku size color has Item itemsku stylecode stylenam styledate size color Step 1 catid catnam has Style has Category catid catnam Step 2 stylecode stylenam, styledate Category stylecode stylenam styledate catid catnam Each simple conceptual normalization step • adds one entity class • adds one relationship link © Ellis Cohen 2001-2008 111 Database Design Process Requirements Conceptual Design & Conceptual Normalization Conceptual Model Relational Mapping & Relational Normalization Relational Model Physical Mapping Physical Model using DDL & DCL © Ellis Cohen 2001-2008 112