The Relational Data Model 1 Overview • Relational model basics • Integrity rules • Rules about referenced rows 2 University Database A relational database uses a series of tables or files called relations 3 University Database in MS Access 4 4 Tables • Relational database is a collection of related tables • Heading: table name and column names –attributes or fields • Body: rows, occurrences of data. Each row is a record STUDENT table StdSSN 123-45-6789 124-56-7890 234-56-7890 StdLastName WELLS NORBERT KENDALL StdMajor IS FIN ACCT StdClass FR JR JR StdGPA 3.00 2.70 3.50 5 5 Alternative Terminology Table-oriented Set-oriented Record-oriented (RDBMS) Table (Formal Terms) Relation Record-type, file Row Tuple Record Column Attribute Field No. of rows Cardinality No. of records No. of columns Degree No. of fields Primary key Primary key Record key 6 6 Data Structure Overview • Structure all data into tables • • • • • • User sees only tables A table is composed of rows and columns Rows and columns are not sorted in any particular order No duplicate rows A row may not contain a repeating group The total collection of values which may occur within a column is the domain of that column • The primary key ensures a row’s uniqueness • A foreign key maintains the relationship between tables 7 Data Entity Modeling Concepts: Entity – a class of persons, places, objects, events, or concepts about which we need to capture and store data. • Named by a singular noun Persons: agency, contractor, customer, department, division, employee, instructor, student, supplier. Places: sales region, building, room, branch office, campus. Objects: book, machine, part, product, raw material, software license, software package, tool, vehicle model, vehicle. Events: application, award, cancellation, class, flight, invoice, order, registration, renewal, requisition, reservation, sale, trip. Concepts: account, block of time, bond, course, fund, qualification, stock. 8-8 8 Relational Model • The columns of information (called attributes) in a table (a relation) relate only to the primary key for each row. The column is equivalent to a field. • The rows (called tuples) are uniquely identified by a primary key and are equivalent to a record • The relationship between relations (files/tables) are kept by foreign keys - a primary key of one relation that appears as an attribute of another relation 9 9 Data Modeling Concepts: Foreign Keys Foreign key – a primary key of an entity that is used in another entity to identify instances of a relationship. • A foreign key is a primary key of one entity that is contributed to (duplicated in) another entity to identify instances of a relationship. • A foreign key always matches the primary key in the another entity • A foreign key may or may not be unique (generally not) • The entity with the foreign key is called the child. • The entity with the matching primary key is called the parent. 8-10 10 Data Modeling Concepts: Parent and Child Entities Parent entity - a data entity that contributes one or more attributes to another entity, called the child. In a one-to-many relationship the parent is the entity on the "one" side. Child entity - a data entity that derives one or more attributes from another entity, called the parent. In a one-to-many relationship the child is the entity on the "many" side. 8-11 11 Data Modeling Concepts: Foreign Keys Primary Key Student ID Last Name First Name Dorm 2144 Arnold Betty Smith 3122 Taylor John Jones 3843 Simmons Lisa Smith 9844 Macy Bill 2837 Leath Heather Smith 2293 Wrench Tim Jones Primary Key 8-12 Dorm Residence Director Smith Andrea Fernandez Jones Daniel Abidjan Foreign Key Duplicated from primary key of Dorm entity (not unique in Student entity) 12 Activity 1) Identify the entities and attributes based upon the excerpt from an interview of the manager of an appliance repair shop described on the next several slides. 2) Create the resulting tables. 13 13 Activity (cont’d) “We’re in a bit of trouble here. This is the third week in a row that we have lost a customer’s TV set. Somehow we’ve got to keep better track of who brought in which appliance, and when. And when the appliance is picked up by the customer. We also need to know which of our technicians worked on which appliance, and how long it took. We are also having problems with repair equipment loss. Some equipment has gone missing and I’m afraid it is due to employee theft. The equipment is numbered but we can never remember where we put it. We need to have each employee check each piece out when he needs it, and check it back in when he’s done with it – and only one piece at a time. That way the last guy to check it out pays for it if it’s missing. As you can see this is a big shop and each technician is assigned to a bench but I can’t keep track of which employee is assigned to which bench. Sometimes I wish we hadn’t gotten so big… 14 Activity (cont’d) I had this idea – we’d stamp each appliance the customer brought in, with a number. Then we’d put it on the shelf, and when a technician needed work he’d go to the shelf and sign it out. That way, we’d know who had it last. If the technician can’t fix it, he’d put it back and sign it back in. But only after he wrote down how long he had worked on it. Well it was a great idea. I had these cards printed up and everybody wrote all the information on the card and it worked great. Then we ran out of cards and the print shop was closed and the system fell apart. Here’s another problem – the other day a customer walked in with a whole pile of appliances – TVs, radios, a microwave, - you name it. We had to write his name and phone number on 15 of those little slips. And along the same lines – there was a customer who came in the same day who got rather agitated when she had to give her name and phone number. She seemed to think we should have remembered it because she was just in a week before. So you can see we have some major problems and we hope you can design a better system for us.” 15 Table Rules Columns (attributes) should be non-decomposable (Date and Time are recognized as exceptions) This “account number” 22-04-6794 General Ledger Sub-ledger Customer number Would be represented in a table as follows: Account General Ledger Sub Ledger Customer Number 22 04 6792 16 Table Rules (continued) The order of columns should be arbitrary These are not valid tables because the columns cannot be rearranged without changing the meaning of the data Employee Sales Employee Number FIGURES SALES FOR THE SALES ORDER PART# DETAIL QTY LAST SALES FIVE SALES MONTHS SALES Order ORDER NUMBER (9 ……. OCCU RRENCES) PART# QTY 17 Table Rules (continued) Note that if the order of columns is not arbitrary, the table will generally have one or more of the following: 1. Implicit rather than explicit data 2. An arbitrary limit to the number of occurrences 3. A maintenance anomaly requiring the rearrangement of data when new data is inserted 4. A retrieval anomaly such that the actual position of values is unpredictable 18 Table Rules (continued) The order of the rows should be arbitrary This is not a valid table because the rows cannot be rearranged without changing the meaning of the data SALARY HISTORY EMPLOYEE NUMBER SALARY AMOUNT 1 175.00 1 150.00 1 140.00 2 200.00 2 195.00 2 180.00 Current (employee 1) previous starting Current (employee 2) previous starting 19 Tables Rules (continued) Each row within a table should be unique This is also an invalid table because the rows may not always be unique EMPLOYEE NUMBER SALARY AMOUNT 1 175.00 1 150.00 1 140.00 2 200.00 2 195.00 2 180.00 3 110.00 3 105.00 3 105.00 Current (employee 1) previous starting Current (employee 2) previous starting Current (employee 3) previous starting 20 Table Terms – Duplicate Values A duplicate value is a value or group of values in a table that exactly equal another value or group of values in the same column(s) Rule: Columns or groups of columns marked ND may not contain duplicate values 21 Table Terms – Null Value A null value is a missing entry in a column. Null means “unknown” or inapplicable and is not the same as blank or zero The equivalence of two null values is indeterminate; that is, two null values are not necessarily equal Rule: Columns marked NN (no nulls ) may not contain null values 22 Table Terms – Derived Attribute • A derived attribute is computed from other attributes • Totals • Age computed from Birth-Date • Typically should not be stored directly • Update of derived attributes creates overhead and puts responsibility on coders to ensure updates to derived facts are performed • May use derived attributes in certain circumstances – use sparingly 23 Overview of Keykey Definitions • Candidate • When there are two or more unique keys in a relation, they are called candidate keys. • One is chosen as the primary key, the others are the alternate keys e.g. Employee ID and SSN • Primary Key • Unique, not null, unlikely to change over time • Foreign Key • If the primary key of the relation A also appears in relation B, then it is the foreign key in relation B • The foreign key in Relation B references the relation A 24 Database Integrity Overview • Domain integrity – Appropriate controls must be designed to ensure that no field takes on an inappropriate value Entity integrity: primary keys Every table should have a primary key No two rows with the same primary key value No null values in a primary key Primary key values should not change over time Ensures entities are traceable 25 Referential Integrity (RI) • A parent table contains a primary key which may be referenced by one or more foreign keys • A dependent table (child table) contains a foreign key(s) • Referential integrity – the assurance that a foreign key value in one table has a matching primary key value in the related table. • No restriction • Delete: cascade • Delete: restrict • Delete: set null • Enforcing RI – may have an adverse impact on performance of an application. * More on this topic later 26 Referential Integrity (RI) Default Rules for Enforcing Referential Integrity Action on Parent Insert new row Action on child Insert always OK Update primary key Disallow if parent row has children Delete row Disallow if parent row has children Insert new row Disallow insert if foreign key in new row does not match a primary key value in the parent table Update foreign key Disallow update if updated foreign key does not match a primary key value in the parent table Delete existing row Delete always OK 27 Relationships Among Tables: Appliance Repair APPLIANCE Appl Appl Number Desc PK, SA NN Serial No Model No In Date NN Out Date Customer Problem No FK, NN APPLIANCE EMPLOYEE Appl Employee In Time Out Time Work Number Number Performed <------------------PK------------------> NN NN FK FK EMPLOYEE Employee Employee Bench Number Name Number PK, SA NN FK EQUIPMENT Equip Number <----------FK Status Code FK STATUS Status Code PK FX UF WR Meaning NN, ND Fixed Unfixable Waiting Repair CUSTOMER Customer Customer Phone Number Name Number PK NN BENCH Bench Number PK LOG (EQUIPMENT EMPLOYEE) Employee Time Time Number Stamp In Stamp PK ------------> FK NN EQUIPMENT Equip Tool Number Type PK Serial No Model No Cost 28 Self-Referencing Relationships (Recursive) • Foreign key that references the same table • Represents relationships among members of the same set • Not common but important when occurring 29 Example: Faculty Table FacSSN 098-76-5432 543-21-0987 654-32-1098 765-43-2109 876-54-3210 987-65-4321 FacFirstName LEONARD VICTORIA LEONARD NICKI CRISTOPHER JULIA FacLastName VINCE EMMANUEL FIBON MACON COLAN MILLS FacRank FacSalary FacSupervisor ASST $35,000 654-32-1098 PROF $120,000 ASSC $70,000 543-21-0987 PROF $65,000 ASST $40,000 654-32-1098 ASSC $75,000 765-43-2109 30 Hierarchical Data Display 543-21-0987 Victoria Emmanual 654-32-1098 Leonard Fibon 098-76-5432 Leonard Vince 876-54-3210 Christopher Colan 31 31 Recursive Relationships • 1:1 Recursion • One person can sponsor another person. Each person is sponsored by no more than one person. • Place key of sponsor in row of person being sponsored. PERSON1 Person PersonSponsored Jones Smith Smith Parks Parks null Myrtle Pines Pines null Referential integrity constraint: PersonSponsored in PERSON1 must exist in Person in PERSON1 32 Recursive Relationship: 1:1 Alternative 1:1 Recursive relationship alternative Referential integrity constraint: PersonSponsoredBy in PERSON2 must exist in Person in PERSON2 Person Jones Smith Parks Myrtle Pines PERSON2 Relation PersonSponsoredBy null Jones Smith null Myrtle 33 1:M Recursion • A customer can make many referrals to new customers. 1:N Recursive relationship. Sample data for the REFERRED_BY relationship Customer Number 100 300 400 CustomerNumber 100 200 300 400 500 600 700 Referred These Customers 200, 400 500 600, 700 CUSTOMER Relation CustomerData …. …. …. …. …. …. …. ReferredBy null 100 null 100 300 400 400 Referential integrity constraint: ReferredBy in CUSTOMER must exist in CustomerNumber in CUSTOMER 34 Doctors may treat other doctors and a doctor may be treated by more than M:M •Recursion one doctor. This requires a new table to define the relationship. (DoctorID would be used in place of names) Provider Jones Parks Smith Abernethy Franklin Name Jones Parks Smith Abernethy O’Leary Franklin Receiver Smith Abernethy Jones Franklin DOCTOR relation Other Attributes ….. …... ….. …… ….. ….. 35 • M:M Recursion (continued) • Doctors may treat other doctors and a doctor may be treated by more than one doctor. This requires a new table to define the relationship. (DoctorID would be used in place of names) TREATMENT-INTERSECTION Referential integrity constraint: Physician in TREATMENTINTERSECTION must exist in Name in DOCTOR Patient in TREATMENTINTERSECTION must exist in Name in DOCTOR Physician Jones Parks Smith Abernethy Parks Franklin Jones Patient Smith Smith Abernethy Jones Franklin Abernethy Abernethy 36 More on M-N Relationships • Rows of each table are related to multiple rows of the other table • Not directly represented in the relational model • Use two 1-M relationships and an associative table 37 37 More on M-N Relationships (cont’d) | | 38 38 Referenced Rows • Referenced row • Foreign keys reference rows in the associated primary key table • Enrollment rows refer to Student and Offering • Actions on referenced rows • Delete a referenced row • Change the primary key of a referenced row • Referential integrity should not be violated 39 39 Possible Actions • Restrict: do not permit action on the referenced row • Cascade: perform action on related rows • Nullify: only valid if foreign keys accept null values • Default: set foreign keys to a default value 40 40 Default Rules for referential Integrity Action on Parent Action on child Insert new row Insert always OK Update primary key Disallow if parent row has children Delete row Disallow if parent row has children Insert new row Disallow insert if foreign key in new row does not match a primary key value in the parent table Update foreign key Disallow update if updated foreign key does not match a primary key value in the parent table Delete existing row Delete always OK 41 Entity Relationships Diagrams 42 Overview • • • • Notation basics Understanding relationships Generalization hierarchies Diagram rules 43 Entity-Relationship Diagrams • An entity is person, place, thing, or event for which data is collected • A relationship is a logical association between entities • A relationship exists between the entities Product and Warehouse because products are stored in warehouses • A logical relationship is based on the nature of the entities involved and the environment in which they exist • The logical relationship between entities in an information system are graphically represented in an entity-relationship diagram (ERD) 44 Building an ER Diagram • Define the entity classes • Define and model the relationships between the entity classes • Identify which attributes belong to which entity classes 45 Entity-Relationship Diagrams ENTITY 1 • The diagram shows the basic format of an ERD for two related entities • Each entity is shown as rectangle IS RELATED TO • Entity rectangles are labeled with singular nouns • The relationships between entities is shown as a line labeled with active verbs ENTITY 2 Basic entity-relationship diagram 46 Entity-Relationship Diagrams DOCTOR • The diagram can be interpreted in two ways: a doctor treats a patient,and a patient is treated by a doctor • Each interpretation is equally valid TREATS • Entity-relationship diagrams do not depict data or information flows • One entity will be above or to the left of the other entity, but this does not imply a superior/inferior relationship or a flow from the first entity to the second entity PATIENT The ERD is interpreted as simple English sentences 47 ERD for One-to-One Relationships (1:1) Driver Faculty Member Drives Delivery Truck Department Is Chairperson of 48 ERD for One-to-Many Relationships (1:M) The relationship between BOOK and EDITION is one-tomany: one book appears in many editions, but each edition is associated with only one book 49 ERD for Many-to-Many Relationships (M:N) The relationship between Student and Course Offering is many-to-many: one student can take many courses, and one course can have many students enrolled in it 50 Many-to-Many Relationships (M:N) (cont’d) An Author can write many Books and a Book can be written by more than one Author 51 Basic Symbols Many developers like to show the FK (CourseNo) in the list of attributes 52 52 Cardinality Notation 53 Associative Entity Types for M-way Relationships 54 More Relationships What is the relationship between Office and Faculty? In which table would you place the FK? What is the relationship between Faculty and Offering? In which table would you place the FK? Is it possible to implement this relationship as it is? More about this later. 55 Associative Entity Type Example 56 Comprehensive Example 57 Completeness Rules • Primary Key Rule: all entity types have a PK (direct, indirect, or inherited) • Naming Rule: all entity types, relationships, and attributes have a name • Cardinality Rule: cardinality is specified in both directions for each relationship • Entity Participation Rule: all entity types participate in an at least one relationship except for entity types in a generalization hierarchy 58 Activity 1) Draw a simple ERD containing the Order, Customer, and Product entities. Show attributes and PKs and FKs. 2) Choose appropriate relationship names using your common knowledge of interactions between customers, products and orders. 3) Extend the ERD to include the Employee entity. 4) Choose appropriate relationship names using your common knowledge of interactions between employees and orders. 5) The next slide describes the attributes for each entity. 59 59 Activity (cont’d) • Customer attributes: CustNo , CustFirstName, CustLastName,, CustStreet, CustCity, CustState, CustZip, CustBal (balance.) • Order attributes: OrdNo, OrdDate, OrdName, OrdStreet, OrdCity, OrdState, OrdZip. • Product attributes: ProdNo, ProdName, ProdQOH, ProdPrice, ProdNextShipDate. • Employee attributes: EmpNo, EmpFirstName, EmpLastName, EmpPhone, EmpEmail, EmpCommRate (commission rate), EmpDeptName. 60 Identification Dependency Weak Entity: an entity that borrows all or part of its primary key from another entity type. The primary key of Room is a combination of BldgID and RoomNo 61 Identification Dependency Rules • Weak entity rule: weak entities (also called Dependent entities)have at least one identifying relationship • Identification dependency cardinality rule: the minimum and maximum cardinality must equal 1 for a weak entity in all identifying relationships 62 ERD Notation for Self-Referencing (Recursive) Relationships 63 Generalization Hierarchies 64 Inheritance • Subtypes inherit attributes of supertypes (direct and indirect) • Allows abbreviation of attribute list • Applies to code (methods) as well as attributes (data) 65 Generalization Constraints 66 Multiple Levels of Generalization Security Symbol SecName LastClose D,C Stock OutShares IssuedShares Bond Rate FaceValue D,C Common PERatio Dividend Preferred CallPrice Arrears 67 Comprehensive Example 68 Completeness Rules • Primary Key Rule: all entity types have a PK (direct, indirect, or inherited) • Naming Rule: all entity types, relationships, and attributes have a name • Cardinality Rule: cardinality is specified in both directions for each relationship • Entity Participation Rule: all entity types participate in an at least one relationship except for entity types in a generalization hierarchy • Generalization Hierarchy Participation Rule: at least one entity type in a generalization hierarchy participates in a relationship 69