CSUN Information Systems Systems Analysis & Design http://www.csun.edu/~dn58412/IS431/IS431_SP16.htm Data Modeling IS 431: Lecture 5 1 Data Modeling Elements of Entity Relationship Diagram (ERD) Relational Data Model IS 431 : Lecture 5 2 Data Modeling Data Modeling (database modeling, information modeling) is a technique for organizing and documenting a system’s data in a model. Entity Relationship Diagram (ERD) depicts data in terms of the entities and relationships described by the data. IS 431 : Lecture 5 3 From Data Model to DB Implementation ERD: a conceptual model of data entities (things of interest) , their attributes (characteristics of interest), and their relationships (with other things) in an information system (technical independent).(Analysis) Relational Data Model: a blueprint for implementation of a conceptual data model (ERD) in relational database environment (software independent)(Design) MS Access Relationship Window: a graph showing how a data model is implemented with Microsoft Access (a specific DBMS software)(Implementation) IS 431 : Lecture 5 4 Entity-Relationship Diagrams Database = data + relationship ERD is used to model data and their relationship ERD is a graphical representation of a conceptual data model ERD is resource independent: it does not commit to any particular database environment. IS 431 : Lecture 5 5 Entities Entity is a group of attributes corresponding to the same conceptual object about which we need to capture and store data – objects, persons, places, events concepts whose existence is not dependent on many other entities Entity is a set of occurrences (instances) of the object that it represents Entity must have a unique name (a singular noun), unique identifier, and at least one attribute (the identifier itself) IS 431 : Lecture 5 6 Entities: Examples Persons: agency, contractor, customer, department, division, employee, instructor, student, supplier. Places: sales region, building, room, branch office, campus. Objects: book, machine, part, product, raw material, software license, software package, tool, vehicle model, vehicle. Events: application, award, cancellation, class, flight, invoice, order, registration, renewal, requisition, reservation, sale, trip. Concepts: account, block of time, bond, course, fund, qualification, stock. IS 431 : Lecture 5 7 Entity Instance: Example Entity instance – a single occurrence of an entity. Attributes Entity Student ID Last Name First Name Instances 2144 Arnold Betty 3122 Taylor John 3843 Simmons Lisa 9844 Macy Bill 2837 Leath Heather 2293 Wrench Tim IS 431 : Lecture 5 8 Entities: Attributes An attribute is a descriptive property or characteristic of interest of an entity. Also called element, property, and field. The data type for an attribute defines what type of data can be stored in that attribute. The domain of an attribute defines what values an attribute can legitimately take on. The default value for an attribute is the value that will be recorded if not specified by the user. IS 431 : Lecture 5 9 Data Type Representative Logical Data Types for Attributes Logical Data Type Logical Business Meaning NUMBER Any number, real or integer. TEXT A string of characters, inclusive of numbers. When numbers are included in a TEXT attribute, it means that we do not expect to perform arithmetic or comparisons with those numbers. MEMO Same as TEXT but of an indeterminate size. Some business systems require the ability to attach potentially lengthy notes to a give database record. DATE Any date in any format. TIME Any time in any format. YES/NO An attribute that can assume only one of these two values. VALUE SET A finite set of values. In most cases, a coding scheme would be established (e.g., FR=Freshman, SO=Sophomore, JR=Junior, SR=Senior). IMAGE Any picture or image. IS 431 : Lecture 5 10 Data Domains Representative Logical Domains for Logical Data Types Data Type Domain Examples NUMBER For integers, specify the range. For real numbers, specify the range and precision. {10-99} {1.000-799.999} TEXT Maximum size of attribute. Actual values are usually infinite; however, users may specify certain narrative restrictions. Text(30) DATE Variation on the MMDDYYYY format. MMDDYYYY MMYYYY TIME For AM/PM times: HHMMT For military (24-hour times): HHMM HHMMT HHMM YES/NO {YES, NO} {YES, NO} {ON, OFF} VALUE SET {value#1, value#2,…value#n} {table of codes and meanings} {M=Male F=Female} IS 431 : Lecture 5 11 Entities: Identification A key is an attribute, or a group of attributes, that assumes a unique value for each entity instance. A group of attributes that uniquely identifies an instance of an entity is called a composite (or concatenated) key. A candidate key is a “candidate to become the primary key” of instances of an entity. (StudentID, SSN, DriverLicenseNo) A primary key (identifier) is that candidate key that will most commonly be used to uniquely identify a single entity instance. (StudentID) Any candidate key that is not selected to become the primary key is called an alternate key. (SSN, DriverLicenseNo) A secondary key is an attribute whose values divide all entity instances into useful subgroups/sub-criteria. (Major, Gender, etc) IS 431 : Lecture 5 12 Entities ... ENTITY NAME CUSTOMER - entity id - attribute 1 - attribute 2 - ………….. - attribute n - Customer_ID - Cust_Name - Cust_Address - Cust_Phone IS 431 : Lecture 5 13 Relationships A Relationship documents an association between one, two, or more entities It must have a name (and may carry data) Degree of Relationship (number of entities) Cardinalities of Relationship (number of instances/members) IS 431 : Lecture 5 14 Relationships: Degree Degree of Relationship defines how many entities are involved in a relationship: – – – – Recursive (Unary) Binary Ternary …. IS 431 : Lecture 5 15 Relationships: Cardinalities Cardinalities document how many occurrences/members of one entity can relate to a single occurrence/member of another entity in a relationship. Max / Min number of occurrences Reflect business policies or general business practices (e.g., how many classes a student can take, how many students a class can hold). (15, 35) Student (1, 5) Take IS 431 : Lecture 5 Class 16 Cardinalities ... One-to-One (1:1) Relationship Sales 1 Pay 1 Cash Collections M Cash Collections Ex: Cash Sales One-to-Many (1:M) Relationship Sales 1 Pay Ex: Installment Payments IS 431 : Lecture 5 17 Cardinalities ... Many-to-One (M:1) Relationship Sales M Pay 1 Cash Collections Ex: Pay many credit purchases in full Many-to-Many (M:N) Relationship Sales M Pay N Cash Collections Ex:Pay credit purchases with partial payments over some months IS 431 : Lecture 5 18 Cardinalities ... Maximum / Minimum Cardinality Employee (0,E) Possess (1,S) Skill An employee must have minimum one skill A particular skill may not be possessed by any employee (1,E) Employee (0,P) Belong Project A project must have minimum one employee A particular employee may not belong to any project IS 431 : Lecture 5 19 Relationships ... Recursive relationships involve only one entity (occurrences in the same entity) N N TOWN ENTITY - entityID - attribute 1 - attribute 2 M Relationship Name Attrib 1, Attrib 2 TRAVEL Distance -TownID -Town_Name M IS 431 : Lecture 5 20 Relationships ... Binary relationship ENTITY 1 -entity1_ID -attribute 11 -attribute 12 ENTITY 2 M Relationship Name Attrib 1, Attrib 2 N EMPLOYEE PROJECT M - Emp_ID - Emp_Name - Emp_Title -entity2_ID -attribute 21 -attribute 22 Manage Date IS 431 : Lecture 5 N - Project_ID - Proj_Name - Proj_Due 21 Relationships ... Ternary relationship EMPLOYEE M - EmpID - Emp_Name - Emp_Title Assign Date N PROJECT - ProjectID - Proj_Name - Proj_Due P TASK - TaskID - TaskName IS 431 : Lecture 5 22 Logical Data Modeling Stages 1. Context Data model – To establish project scope 2. Key-base data model – – – – Eliminate nonspecific relationships Add associative entities Include primary and alternate keys Precise cardinalities 3. Fully attributed data model – – All remaining attributes Subsetting criteria 4. Normalized data model IS 431 : Lecture 5 23 What is a Good Data Model? A good data model is simple. – Data attributes that describe any given entity should describe only that entity. – Each attribute of an entity instance can have only one value. A good data model is essentially non redundant. – Each data attribute, other than foreign keys, describes at most one entity. – Look for the same attribute recorded more than once under different names. A good data model should be flexible and adaptable to future needs. IS 431 : Lecture 5 24 Building ERD Identify entities of interest (Use REAL framework – Resources/Event/Agents/Locations – more in Lecture 5B) – (Top Down) Identify all the attributes with sufficient details (context specific) Assign attributes to entities – (Bottom Up) Identify degrees of relationships between entities (context specific) Complete the relationships with cardinalities (context specific) Build the model IS 431 : Lecture 5 25 Rules in ERD Building Each entity must have a name Each entity must have an identifier An occurrence itself cannot be an entity (a constant or a table with one unique record?) Each relationship must have a name (may or may not carry data) Reasonable cardinalities (context specific) IS 431 : Lecture 5 26 Alternative ERD Notations Attribute 1 Attribute 3 Attribute 4 Attribute 1 Attribute 2 relates to Entity 1 Entity 2 is related to Attribute 5 IS 431 : Lecture 5 Attribute 2 Attribute 3 Attribute 4 27 Alternative ERD Notations Entity 1 0 Entity Entity Entity 1 1 Entity Entity Entity M 0 Entity Entity Entity M 1 Entity Entity (0, 1) (1, 1) (0 , *) (1 , *) IS 431 : Lecture 5 28 Associative Entity •An associative entity is an entity that inherits its primary key from more than one other entity (called parents). •Each part of that composite (concatenated) key points to one and only one instance of each of the connecting entities. •Represent an M:N relationship carrying data EMPLOYEE PROJECT 1 - Emp_ID - Emp_Name - Emp_Title M ASSIGN Date IS 431 : Lecture 5 N 1 - Project_ID - Proj_Name - Proj_Due 29 M:N Relationship vs. Associative Entity EMPLOYEE PROJECT M - Emp_ID - Emp_Name - Emp_Title EMPLOYEE - Emp_ID - Emp_Name - Emp_Title 1 ASSIGN Date ASSIGNMENT M N - Emp_ID - Proj_ID - Date IS 431 : Lecture 5 N - Project_ID - Proj_Name - Proj_Due PROJECT 1 - Project_ID - Proj_Name - Proj_Due 30 Foreign Keys in Relational Database A foreign key (FK) in Entity E1(CustID in ORDER) is a primary key of another Entity E2 (CustID in CUSTOMER), which is used to identify (link) a 1:M relationship between E1 and E2 (CUSTOMER and ORDER). Foreign key is made on the many side (CUSTOMER has many ORDERS, therefore ORDER carries CustID as FK to show which Customer places that Order) IS 431 : Lecture 5 31 Foreign Key CUSTOMER CUSTOMER CustomerID ORDER ORDER OrderID CustomerID IS 431 : Lecture 5 1:M Relationship Primary Key Foreign Key 32 Foreign Keys in Relational Database. . . In M:N relationship, the associative/junction table with a composite key will be used to capture the relationship. – ORDER involved many PRODUCTS, PRODUCT involved in many ORDERS. Composite key ProductID-OrderID for LINE ITEM to indicate which product involves in which sales Each part of the composite key serves like a foreign key. Sometimes, a “surrogate” key (RecordNo) is used as primary key to simplify the identification of record. IS 431 : Lecture 5 33 Composite Key ORDER PRODUCT ORDER OrderID PRODUCT ProductID M:N Relationship Primary Key LINE_ITEM RecordNo OrderID ProductID Composite Key JUNCTION TABLE IS 431 : Lecture 5 34 ERD Case Studies • ERD 1: Group Consultants Database • ERD 2: Video Rental Database • ERD 3: Health Club Database • ERD 4: Music Store Database to be discussed in class IS 431 : Lecture 5 35 Convert ERD to Relational Data Model Legend – – – – E1, E2, E3 , … represent entities a1, a2, a3, …, b1, b2, b3, … represent entity attributes R represents a relationship r represents one or many attributes that can be carried by a relationship IS 431 : Lecture 5 36 Convert ERD to Relational Data Model … Pattern #1 E1 - a1 - a2 E1( a1, a2 ) TASK - TaskID - TaskName TASK( TaskID , TaskName ) IS 431 : Lecture 5 37 Convert ERD to Relational Data Model … Pattern #2 E1 - a1 - a2 1 E1( a1, a2 ) R r R( a1, a1’, r) 1 1 INDIVIDUAL - SSN - Name MARRY Date 1 INDIVIDUAL ( SSN , Name) MARRY ( SSN , SSN’, Date) Husband IS 431 : Lecture 5 Wife 38 Convert ERD to Relational Data Model … Pattern #3 E1 - a1 - a2 1 E1( a1, a2 ) R R( a1’, a1 ) M 1 INDIVIDUAL - SSN - Name LEAD M INDIVIDUAL ( SSN , Name) LEAD ( SSN’ , SSN ) Member IS 431 : Lecture 5 Leader 39 Convert ERD to Relational Data Model … Pattern #4 E1 - a1 - a2 TOWN - TownID - Name M R E1( a1, a2 ) r R( a1, a1’, r) N M Tour Date N TOWN ( TownID , Name) TOUR ( TownID, TownID’ , Date ) IS 431 : Lecture 5 40 Convert ERD to Relational Data Model … Pattern #5 E1 - a1 - a2 1 EMPLOYEE 1 - EmpID - EmpName R 1 MANAGE E2 - b1 - b2 - b3 E1( a1, a2 ) E2( b1, b2 , b3 , a1) OR E1( a1, a2 , b1 ) E2( b1, b2 , b3 ) PROJECT 1 - ProID - ProName - ProDue EMPLOYEE(EmpID, EmpName) PROJECT(ProID, ProName, ProDue, EmpID) OR EMPLOYEE(EmpID, EmpName, ProID) PROJECT(ProID, ProName, ProDue ) IS 431 : Lecture 5 41 Convert ERD to Relational Data Model … Pattern #6 E1 - a1 - a2 1 CUSTOMER 1 - CustID - CustName R M INVOLVE E2 - b1 - b2 - b3 E1( a1, a2 ) E2( b1, b2 , b3 , a1) PURCHASE M - PurchID - Item - Quantity CUSTOMER (CustID, CustName) PURCHASE(PurchID, Item, Quantity, CustID) IS 431 : Lecture 5 42 Convert ERD to Relational Data Model … Pattern #7 E1 - a1 - a2 M CUSTOMER M - CustID - CustName R r SALE Date N E2 - b1 - b2 - b3 E1( a1, a2 ) E2( b1, b2 , b3) R( a1, b1, r) SALESREP N - RepID - RepName CUSTOMER (CustID, CustName) SALESREP (RepID, RepName) SALE(CustID, RepID, Date) IS 431 : Lecture 5 43 Convert ERD to Relational Data Model … Pattern #8 E1 - a1 - a2 1 R r N E2 - b1 - b2 - b3 P E3 - c1 - c2 - c3 IS 431 : Lecture 5 E1( a1, a2 ) E2( b1, b2 , b3) E3( c1, c2 , c3 ) R( b1, c1 , a1, r) 44 Convert ERD to Relational Data Model … Pattern #8 (example) EXECUTIVE 1 - ExecID - ExecName RESPONSIBLE AppointDate DEPT N - DeptID - DeptName P PROJECT - ProjectID EXECUTIVE( ExecID, ExecName) DEPT( DeptID, DeptName) PROJECT (ProjectID) RESPONSIBLE (DeptID, ProjectID, ExecID, AppointDate ) IS 431 : Lecture 5 45 Convert ERD to Relational Data Model … Pattern #9 E1 - a1 - a2 E2 M R r N - b1 - b2 - b3 P E3 - c1 - c2 - c3 IS 431 : Lecture 5 E1( a1, a2 ) E2( b1, b2 , b3) E3( c1, c2 , c3 ) R(a1, b1, c1 , r) 46 Convert ERD to Relational Data Model … Pattern #9 (example) EMPLOYEE M - EmpID - EmpName ASSIGN TASK N - TaskID Date P PROJECT - ProjectID EMPLOYEE( EmpID, EmpName ) TASK( TaskID) PROJECT( ProjectID) ASSIGN(EmpID, TaskID, ProjectID, Date) IS 431 : Lecture 5 47 Generalization • Generalization is a technique wherein the attributes that are common to several types of an entity are grouped into their own entity, called a supertype. An entity supertype is an entity whose instances store attributes that are common to one or more entity subtypes. An entity subtype is an entity whose instances inherit some common attributes from an entity supertype and then add other attributes that are unique to an instance of the subtype. • • IS 431 : Lecture 5 48 Type & Subtypes Employee Name DOB SSN DateHire Executive Position Location DateAppoint Salary Foreperson Factory Station DateAppoint Worker Rank Wage Rate Wage Rate IS 431 : Lecture 5 49 Generalization Hierarchy IS 431 : Lecture 5 50 Data Analysis & Normalization Data analysis is a process that prepares a data model for implementation as a simple, non-redundant, flexible, and adaptable database. The specific technique is called normalization. Normalization is a technique to organize data attributes such that they are grouped to form nonredundant, stable, flexible, and adaptive entities. IS 431 : Lecture 5 51 Normalization (in Latin) First normal form (1NF) – an entity whose attributes have no more than one value for a single instance of that entity – Any attributes that can have multiple values actually describe a separate entity, possibly an entity and relationship. Second normal form (2NF) – an entity whose nonprimary-key attributes are dependent on the full primary key. – Any nonkey attributes that are dependent on only part of the primary key should be moved to any entity where that partial key is actually the full key. This may require creating a new entity and relationship on the model. Third normal form (3NF) – an entity whose nonprimary-key attributes are not dependent on any other non-primary key attributes. – Any nonkey attributes that are dependent on other nonkey attributes must be moved or deleted. Again, new entities and relationships may have to be added to the data model. IS 431 : Lecture 5 52 Normalization (in Plain English !!!) First normal form (1NF) : – No repeating group of a same attribute (multi-valued attribute) – If not: create a new record / entity for this group. Second normal form (2NF) – Attributes should depend on the whole (composite) key, not part of it (partial functional dependency). – If not: create a new entity for these partial depended attributes Third normal form (3NF) – Attributes should depend on the (primary) key only, not on each other (transitive dependency) – If not: create new entity for these partial depended attributes IS 431 : Lecture 5 53 Un-normalized Relation Observation: Repeating groups / multi-value attributes !!! IS 431 : Lecture 5 54 Relation in 1NF Observation: Attributes still depend on a part of the key !!! IS 431 : Lecture 5 55 Relations in 2NF Observation: Attributes still depend on a non-key attribute !!! IS 431 : Lecture 5 56 Relations in 3NF Observation: Now each table stores data about one thing only. IS 431 : Lecture 5 57 Example of Relational Database IS 431 : Lecture 5 58 Example of Relational Database . . . IS 431 : Lecture 5 59 Data Dictionary EMPLOYEE Attributes Type Size Description Authorization EmpID Numeric 6 Identifier HR Manager EmpFirstName Text 10 Employee First Name HR Manager EmpLastName Text 10 Employee Last Name HR Manager Address Text 50 Employee Address HR Manager City Text 10 Employee City HR Manager State Text 2 Employee Last Name HR Manager Zip Text XXXXX Employee Last Name HR Manager Phone Text XXX-XXX-XXXX Employee Last Name HR Manager Date Hired Date MM/DD/YY Date Hired Employee HR Manager Position Text 15 Position of Employee HR Manager Attributes Types Size Description Authorization EntryNumber Numeric 6 Identifier Project Manager EntryDate Date MM/DD/YY Date of Entry Project Manager HoursWorked Numeric 3 Hours on Task Project Manager HotelExpense Currency 4 Fund Spent on Hotel Project Manager TravelExpense Currency 4 Fund Spent on Travel Project Manager FoodExpense Currency 4 Fund Spent on Food Project Manager Approved Y/N 1 Approved / Not Yet Project Manager EXPENSES IS 431 : Lecture 5 60 Data to Process Matrix IS 431 : Lecture 4 61 Data-to-Location CRUD Matrix IS 431 : Lecture 5 62