Entities
Entities are the people, places, things, or events that are of interest for a system that we are planning to build.
Some examples of entities are:
• People: student, customer, employee, etc.
• Places: resort, city, country, etc.
• Things: restaurant, product, invoice, movie, painting, book, building, contract, etc.
• Events: registration, election, presentation, earthquake, hurricane, etc.
1

Entities
• Exercise 1: Consider your educational institution. Your educational institution needs to keep track of its students. How many student entities does it have? What attributes describe these entities?
• Exercise 2: Consider your place of work. The Human Resources department in your company needs to manage information about its employees. How many employee entities does it have? What attributes describe these entities?
2

Entity sets
• Entity sets are collections of related entities.
• Entities are related by their classification:
1. student entities are related by the fact that they are all students
2. invoice entities are related by the fact that they are all invoices
3. car entities are related by the fact that they are all cars
4. a department entity set does not contain invoice entities
5. a product entity set does not contain employee entities
3

Entity sets are named after the entities that belong to the set. It is a common convention that entity set names are singular and that at least the first letter is capitalized. For example:
1. An entity set named Student contains student entities.
2. An entity set named Invoice contains invoice entities.
3. An entity set named InvoiceLine contains invoice line entities.
4. An entity set named Product contains product entities.
4

Occasionally we find it useful to use a set diagram to represent an entity set and its entities.
[Figure: set diagram of a student entity set containing John, Amelia, Lee, and April.]
This student entity set has four entities in it: John, Amelia, Lee, and April.
5

Exercise 3: Consider your educational institution or place of work.
What are some of the entity sets that would be useful? Based on your knowledge of the organization, draw some entity sets (similar to the figure on page 5) that illustrate some of the entities you know of.
In practice, you may not hear the term entity set used very often. When an analyst or modeler refers to an entity set Student, you are more likely to hear them say something like “We need a Student entity” instead of “We need a Student entity set”.
6

We use a rectangular symbol to represent an entity set in the Entity Relationship Diagram (ERD).
[Figure: three rectangles labeled Department, Student, and Course.]
Exercise 4: Consider the previous exercise, but now draw an ERD depicting the entity sets as done in the figure above.
7

Weak Entities
Sometimes we know that certain entities exist only in relationship to others.
• Consider an invoice and the lines on that invoice.
• On each line there are details pertaining to the quantity and price for a product.
• An invoice line is something that will exist only if the corresponding invoice exists.
• We say in a case like this that an Invoice Line is a weak entity; it is existence-dependent on another entity.
A weak entity cannot be identified by its own attributes alone; it has no complete key of its own.
8

[Figure: ERD with entity sets Invoice and Invoice Line; Invoice Line is drawn with a double-lined rectangle.]
An Invoice Line is existence-dependent on an Invoice, and so it is shown with a double-lined rectangle.
Exercise 5: Consider a requirement having to do with benefits that may be given to employees of a company. Suppose it is necessary to know something about each employee’s dependents. Illustrate employees and dependents as separate entity sets. Can one of these entity sets be considered a weak entity set?
9

Attributes
• Attributes are the characteristics that describe entities.
• A Student entity may be described by attributes including:
- student number
- name
- address
- date of birth
- degree major
10

• An Invoice entity may be described by attributes including:
- invoice number
- invoice date
- invoice total
11

We illustrate attributes using ovals.
[Figure: entity set Invoice with attribute ovals number, date, and total.]
An entity set, Invoice, with three attributes.
12

A common convention for naming attributes is to use singular nouns. A naming convention may require one of:
• All characters are in upper case.
• All characters are in lower case.
• Only the first character is in upper case.
• Each part of a multipart name has the first character capitalized.
13

A typical convention is for attribute names to have a prefix that indicates the entity the attribute describes. Subsequent characters are sufficiently descriptive to identify the attribute.
Some examples of attribute names:
• Lname = last name
• empLname = employee last name
• stuGpa = student grade point average
• ProdCode = product code
• InvNum = invoice number
14

[Figure: ERD with entity sets Product, InvLine, and Invoice, relationships labeled specifies and lists, and attribute labels proPrice, proNum, proDesc, InvDate, invNum, invAmount, invLineNum, invLineQty, invDate.]
A simple ERD illustrating entities and attributes
15

Attribute Classification
• Atomic Attributes
• Composite Attributes
• Single-valued Attributes
• Multi-valued Attributes
• Key Attributes
• Partial Keys
• Surrogate Key
• Non-key Attributes
• Derived Attributes
16

Atomic Attributes
A simple, or atomic, attribute is one that cannot be decomposed into meaningful components.
Example: an attribute for product price, prodPrice, cannot be decomposed, because you cannot subdivide prodPrice into a finer set of meaningful attributes. The value of the attribute prodPrice could be $21.03.
Of course, one could decompose prodPrice into two attributes, the dollar component (21) and the cents component (03), but our assumption here is that such a decomposition is not meaningful to the intended application or system that will make use of it.
17

Exercise 6: Make a list of attributes that would be required for a) a student entity set, b) an employee entity set. Consider each of your attributes, and verify that each is atomic. Create an ERD that illustrates the entity set and its attributes.
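As a rough illustration of the atomic-attribute idea, the sketch below models one product entity in Python. The class name, the use of `Decimal`, the snake_case attribute names (renderings of the slides' prodPrice-style names), and the sample values are all assumptions made for this example, not part of the slides.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Product:
    """One entity of a hypothetical Product entity set."""
    prod_num: int        # product number
    prod_desc: str       # product description
    prod_price: Decimal  # atomic: stored and used as one value


# Sample entity; the values are made up for illustration.
widget = Product(prod_num=1, prod_desc="Widget", prod_price=Decimal("21.03"))

# The price *could* be split into a dollar part and a cents part,
# but the model assumes that split is not meaningful to the application,
# so prod_price stays a single attribute.
dollars, cents = divmod(widget.prod_price * 100, 100)
```

Whether an attribute counts as atomic depends on the application: if a system did need to work with cents on their own, the same price would be modeled differently.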
18

Composite Attributes
A composite attribute is an attribute that is shown as comprising two or more simpler attributes; we show a composite attribute in the figure below.
[Figure: Employee entity set with attribute empName, which comprises empLname and empFname.]
empName shown as a composite attribute
19

Exercise 7: Many entities will need an address attribute. For example, a student has an address, and an employee has an address. Show, in an ERD, how the address attribute can be shown as a composite attribute.
20

Single-valued Attributes
An attribute is single-valued if, for any instance of the pertinent entity set, there is only one value at a given time for the attribute.
[Figure: Employee entity set with attribute empGender.]
Exercise 8: A college or university will keep track of several addresses for a student, but each of these can be named differently: for example, consider that a student has a mailing address and a home address. Create an ERD for a student entity set with two composite single-valued attributes for student addresses.
21

Multi-valued Attributes
Consider an attribute to track each employee’s university degrees - empDegree. Since an employee could have none, one, or several degrees, we say empDegree is multi-valued.
Sample data for three employees:

empNum   empPhone   empDegree
123      233-9876   (none)
333      233-1231   BA, BSc, PhD
679      233-1231   BSc, MSc

One of these employees has no degrees, another has 3 degrees, and the last one has 2 degrees.
22

empDegree is shown as a multi-valued attribute
[Figure: Employee entity set with attributes empNum, empPhone, and empDegree.]
The presence of a multi-valued attribute indicates an area that may require more analysis.
23

Exercise 9: Consider the employee entity set. Suppose the company needs to track the names of the dependents for each employee. Show empDependentName as a multi-valued attribute. Modify your ERD to show empDependentName as a composite attribute comprising first and last names and middle initials.
Exercise 10: Create a new ERD that avoids the multi-valued attribute in the figure on page 22.
Hint: Consider using two entity sets: one that keeps track of employee information only, and one that keeps track of each employee's degrees.
24

Key Attributes
A key is an attribute, or combination of attributes, whose value identifies an individual entity.
Suppose an educational institution assigns students individual student numbers in such a way that each student’s number is different from that assigned to any other student.
Student numbers are unique: each student has a different number.
25

In an ERD, keys are shown underlined:
[Figure: Student entity set with attribute stuNum underlined.]
For every entity set, we want to specify which attribute is a key attribute. In this case, Student is shown to have a key named stuNum.
26

Exercise 11: Suppose a company that sells products has identified that the product entity set has the following attributes: prodNum, prodDesc, prodPrice. Suppose all three attributes are single-valued and that prodNum is a key attribute - each product has a different product number. Illustrate this knowledge regarding products in an ERD.
Exercise 12: Suppose a college comprises a number of departments. Suppose also that the college assigns each department a unique number and a unique name. That is, no two departments will have the same number or name. Suppose the college also tracks the name of each department chair and general office number. Illustrate the department entity set and attributes in an ERD.
27

Partial Keys
Sometimes we have attributes that distinguish entities of an entity set from other entities in the same set, but only relative to some other related entity. This situation arises naturally when we model things like invoices and invoice lines.
- If invoice lines are assigned line numbers (1, 2, 3, etc.), these line numbers distinguish lines on one invoice from one another.
- For any given line number value, there could be many invoice lines (from separate invoices).
- Invoice line number is a partial key, or discriminator, for invoice lines.
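The partial-key idea above can be sketched in a few lines of Python. This is only an illustration: the dictionary layout, the helper name `lines_with_number`, and every sample value are hypothetical, chosen to show that a line number identifies a line only in combination with its invoice number.

```python
# Invoice lines keyed by the pair (invNum, invLineNum).
# All values below are made up for illustration.
invoice_lines = {
    (1001, 1): {"invLineQty": 2},
    (1001, 2): {"invLineQty": 1},
    (1002, 1): {"invLineQty": 5},
}


def lines_with_number(line_num):
    """Return the keys of every invoice line whose line number is line_num."""
    return [key for key in invoice_lines if key[1] == line_num]


# Line number 1 on its own matches lines on two different invoices,
# so it cannot identify a single invoice line.
matches = lines_with_number(1)

# Combined with the invoice number, it picks out exactly one line.
one_line = invoice_lines[(1001, 1)]
```

Using the pair as the dictionary key mirrors the way a discriminator is combined with the key of the related entity to identify one entity.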
28

[Figure: ERD with entity sets Invoice and InvoiceLine and attributes invNum and invLineNum.]
Some attributes may serve the purpose of discriminating between instances of entities related to some other entity.
Exercise 13: Consider an educational institution that has departments and where each department offers courses. Typically, departments are assigned unique numbers and so DeptNum is a key for the department entity set. However, suppose course numbers are unique within a department, but not across departments. So, History may have a course numbered 215, and English could have a course numbered 215 as well. In order to identify a particular course we need to know the department and we need to know the course number. Illustrate an ER model including department and course entity sets. Include attributes for the department name, course title, and course description.
29

About Foreign Key
A foreign key is an attribute associated with an entity, which is used to establish a relationship between two entity sets. (Foreign key is not a concept in ERD, but in the relational model.)
Constraints on foreign keys:
1. The domain of a foreign key must be the same as that of the key of another entity.
2. Any value for a foreign key (attribute) must appear in the value set for the key of the related entity.
30

Example
[Figure: Department and Employee linked by a manage relationship, with sample tables. Department columns: DeptNo., …, Essn (DeptNo. values 1, 2, 3, 4; Essn values 661, 765, 768, …). Employee columns: ssn, …, salary (ssn values 661, 765, 768, 567).]
31

Surrogate Key
When a key specified for an entity is meaningless to the entity (it doesn’t describe any characteristic of the entity), and when it has a simple numeric value, the key is referred to as a surrogate key. A key that is not a surrogate key is often referred to as a natural key.
- One way of determining whether your key attribute is a surrogate key is to consider whether the attribute's value is useful to an end-user.
- If it is not meaningful to the end-users and if it is a simple integer, then it’s called a surrogate key.
Exercise 14: Would you consider invoice line number to be a surrogate key?
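Returning to the second foreign key constraint stated earlier (every foreign key value must appear among the key values of the related entity), the check can be sketched as below. The data loosely follows the Department/Employee example, but the three department rows, the dictionary layout, and the helper name `dangling_references` are assumptions made for illustration.

```python
# Key values of the Employee entity set (ssn).
employee_ssns = {661, 765, 768, 567}

# Department entities; Essn is the foreign key naming the managing employee.
# These three rows are illustrative only.
departments = [
    {"DeptNo": 1, "Essn": 661},
    {"DeptNo": 2, "Essn": 765},
    {"DeptNo": 3, "Essn": 768},
]


def dangling_references(rows, fk_name, key_values):
    """Return the rows whose foreign key value is not an existing key value."""
    return [row for row in rows if row[fk_name] not in key_values]


# Every Essn above exists as an ssn, so no row violates the constraint.
violations = dangling_references(departments, "Essn", employee_ssns)

# A department managed by a non-existent employee would be reported.
bad = dangling_references([{"DeptNo": 9, "Essn": 999}], "Essn", employee_ssns)
```

A relational database would normally enforce this rule itself; the sketch only makes the rule concrete.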
32

Non-key Attributes
Non-key attributes are attributes that are not a key. Most attributes are simply descriptive, and fall into this category.
Consider attributes for first name, last name, birth date; usually these attributes are non-key attributes.
Consider a name attribute:
• People may join your organization and arrive with a name; we expect many people in a large organization to have the same first name, same last name, and even the same combination of first and last name.
• Names cannot usually be used as a key.
• Names that are chosen for entities such as departments in an organization could be keys because of the way the company would choose department names - they wouldn't give two different departments the same name.
33

Determining key attributes is an important exercise, and one that requires careful consideration. Most attributes are non-key attributes.
Exercise 15: Consider an employee entity set and decide which attributes are key attributes and which ones are non-key attributes. Illustrate with an ERD.
34

Derived Attributes
If an attribute’s value can be derived from the values of other attributes, then the attribute is derivable, and is said to be a derived attribute.
Example: if we have two attributes for an employee, birth date and current age, then age is derivable by subtracting the birth date from the current date.
[Figure: Student entity set with attributes birthdate and age.]
The age of the employee is a derived attribute.
35

Properties of Attributes
• Must each attribute have a value?
• Sometimes you won’t know the value of an attribute until a certain event occurs.
Example: An Education model has an Enrollment entity set with attributes for the grade awarded to a student in a course, enrGrade, and for the date the student registered, enrDateRegistered.
[Figure: Enrollment entity set with attributes enrGrade and enrDateRegistered.]
36

Sample data showing a Null for student 1, course 765 in department 5.

StuNum   DeptNum   CourseNum   EnrGrade   EnrDateRegistered
1        5         661         A          Jan 1, 2007
1        5         765         (null)     Sept 1, 2007
2        6         765         B          Jan 1, 2007

37

Domains
A domain is a set of values.
Examples:
• A college that assigns each student a student number could decide the domain of student numbers is the set of positive 7-digit integers.
• The total for an invoice, InvTotal, and the price of a product, prodPrice, are both associated with the domain of non-negative currency values.
• The quantity on an invoice line, InvLineQty, is associated with the domain of non-negative integers.
Knowing the underlying domains in your model is important. They help to complete your analysis, they are indispensable for coding specifications, and they are useful for defining meaningful error messages.
38
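To make the link between domains and error messages concrete, here is a minimal Python sketch. The function names and the message wording are hypothetical, and reading "positive 7-digit integers" as the range 1000000 to 9999999 (no leading zeros) is an assumption of this example.

```python
def in_student_number_domain(value):
    """Positive 7-digit integers, read here as 1000000 through 9999999."""
    return isinstance(value, int) and 1_000_000 <= value <= 9_999_999


def in_quantity_domain(value):
    """Non-negative integers, as for the quantity on an invoice line."""
    return isinstance(value, int) and value >= 0


def check(name, value, predicate, description):
    """Return an error message if value is outside the domain, else None."""
    if predicate(value):
        return None
    return f"{name}: {value!r} is not in the domain of {description}"


# A valid student number produces no message.
ok = check("stuNum", 1234567, in_student_number_domain,
           "positive 7-digit integers")

# A negative quantity produces a message naming the violated domain.
error = check("InvLineQty", -1, in_quantity_domain, "non-negative integers")
```

Because each domain is written down once as a predicate with a description, the same definition can drive both validation and the wording of the error message.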