Chapter 3A Objectives
• Understand
  – the relational database model offers a logical view of data
  – the relational model's basic component: relations
  – relations are logical constructs composed of rows (tuples) and columns (attributes)
  – relations are implemented as tables in a relational DBMS

A Logical View of Data
• Relational model
  – View data logically rather than physically
    • A table is a logical construct
  – Structural and data independence
    • The data is not physically tied to file formats
  – Resembles a file conceptually
• Relational database model is easier to understand than hierarchical and network models
  – K.I.S.S. principle

CS275 Fall 2010

Tables and Their Characteristics
• Logical view of relational database is based on relation
  – Relation thought of as a table
• Table: two-dimensional structure composed of rows and columns
  – Persistent representation of logical relation
  – Contains group of related entities (entity set)
• Entity Set, Table, and Relation (E.F. Codd's term) are used interchangeably

Tables and Their Characteristics
• Entity Set examples
  – College DB: students, courses, faculty
    • Attributes for Students: StId, StLname, ..
  – Airline DB: pilots, routes, aircraft
    • Attributes for AirCraft: AC_regNum, AC_milesFlown, …

Tables and Their Characteristics
• Each row is an entity
• Each column is an attribute
  – Attributes are characteristics of an entity
  – Attributes have naming conventions
    • Begin with a letter, not a number
    • Give names a prefix that represents the table name
    • Use the underscore "_" or camelCase to separate parts of a name
    • Don't use arithmetic operators such as dashes in a name, e.g. amount-1

Determination & Primary Keys
• Key's role is based on determination
  – If you know the value of attribute A, you can look up (determine) the value of attribute B
• Key: one or more attributes that determine other attributes
• Primary key (PK) uniquely identifies any given entity (row)
• Functional dependence:
  – Attribute B is functionally dependent on attribute A if each value in A determines one and only one value in B.

Determination & Primary Keys
• Attribute A determines attribute B if all of the rows in the table that agree in value for attribute A also agree in value for attribute B.
• It is possible for functional dependence to be determined by a range of values.
  – Later designs might compute this attribute, not store the classification.

Choosing a Primary Key
• Choosing a primary key is a process of finding attributes based on determination and eliminating attributes by the following steps.
  1. Key attribute
    • Any attribute that is part of a key
  2. Composite key
    • Composed of more than one attribute
  3. Superkey
    • Any key that uniquely identifies each row
  4. Candidate key
    • A superkey without unnecessary attributes
  5. Primary key
    • The chosen candidate key

Null Values
• Not permitted in primary key
• Means 'no data entered'
• Should be avoided in other attributes
• Is used to represent
  – An unknown attribute value
  – A known, but missing, attribute value
  – A "not applicable" condition
• Can create problems in logic and formulas in minimal DBMS
  – when functions such as COUNT, AVERAGE, and SUM are used
  – when relational tables are linked

Relational Schema for CH02_SaleCO
Vendor(v_code, v_contact, v_areaCode, v_phone)
Product(p_code, p_descript, p_price, p_onhand, v_code)

Controlled redundancy
• Controlled redundancy
  – Makes the relational database work
  – Tables within the database share common attributes
    • Enables tables to be linked together, unlike previous DB models, which used pointers/links.
  – Multiple occurrences of values are not redundant when required to make the relationship work
  – Redundancy exists only when there is unnecessary duplication of attribute values

Relational Diagram for CH02_SaleCO
• Notice that the vend_code attribute name is the same in both tables.

Relational Database Keys

Controlled redundancy & Other Keys
• Foreign key (FK)
  – An attribute whose values match primary key values in the related table
• Referential integrity
  – FK contains a value that refers to an existing valid tuple (row) in another relation, or is null
• Secondary key
  – Key used strictly for data retrieval purposes
  – Often a superkey that doesn't become a candidate key and then the primary key
  – Should yield a unique row or few duplicates

Integrity Rules
• Many RDBMSs enforce integrity rules automatically
• Safer to ensure that application design conforms to entity and referential integrity rules
• Some designers use flags to avoid nulls
  – Flags indicate the absence of some value

Integrity Rules
• Entity integrity
  – Ensures all entities in the set are unique
  – Each entity has a unique key
• Referential integrity
  – Foreign key must have null value or match primary key values
  – Makes it impossible to delete a row whose primary key has mandatory matching foreign key values in another table

Summary of Integrity Rules

The Relational Schema & Diagram for CH03_SaleCo
• Vendor(v_code, v_contact, v_areacode, v_phone)
• Product(p_code, p_descript, p_price, p_onhand, v_code)

In Customer, Agent_code attributes all have values which match Agent_code values in Agent.

Data Redundancy Review
• Data redundancy leads to data anomalies
  – Such anomalies can destroy database effectiveness
• Sometimes, data redundancy is necessary
  – Foreign keys create relationships between entities.
• Control data redundancies by using common attributes shared by tables
• Crucial to exercise data redundancy control
• Exceptions:
  – Allowing for tracking data historically.
    • Example: price attribute in product and in line item of invoice, allowing for the product price to be changed.
  – Rarely: speed of processing
• Good design deals with conflicting goals of speed, information requirements, & design elegance.

Summary
• Tables are basic building blocks of a relational database
• Keys are central to the use of relational tables
• Keys define functional dependencies
  – Superkey
  – Candidate key
  – Primary key
  – Secondary key
  – Foreign key

Summary
• Each table row must have a primary key that uniquely identifies all attributes
• Tables are linked by common attributes through controlled redundancy
• Foreign keys are used to link tables
• Secondary keys provide an optional way of retrieving a row or a small subset of rows.
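
Worked example: determination. The rule from the Determination & Primary Keys slides (A determines B if all rows that agree on A also agree on B) can be checked against sample rows. This is only a sketch: the function name `determines`, the `StMajor` attribute, and the sample rows are assumptions made for illustration, and a check on sample data can only show that a dependency is violated, never prove that it holds for every possible row.

```python
def determines(rows, a, b):
    """Return True if attribute(s) `a` determine attribute(s) `b` in `rows`.

    `rows` is a list of dicts; `a` and `b` are tuples of column names.
    """
    seen = {}
    for row in rows:
        a_value = tuple(row[col] for col in a)
        b_value = tuple(row[col] for col in b)
        # setdefault stores the first B value seen for this A value
        # and returns the stored value on every later lookup.
        if seen.setdefault(a_value, b_value) != b_value:
            return False  # two rows agree on A but disagree on B
    return True


# Hypothetical sample data (StMajor is not an attribute named in the slides).
students = [
    {"StId": 1, "StLname": "Smith", "StMajor": "CS"},
    {"StId": 2, "StLname": "Smith", "StMajor": "Math"},
    {"StId": 3, "StLname": "Jones", "StMajor": "CS"},
]

print(determines(students, ("StId",), ("StLname",)))     # True
print(determines(students, ("StLname",), ("StMajor",)))  # False
```

StId determines StLname here, so StId is a reasonable key; StLname does not determine StMajor because the two Smith rows disagree.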
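
Worked example: superkeys and candidate keys. The steps on the Choosing a Primary Key slide can be sketched as a search over attribute combinations: a superkey is any combination with no duplicate rows, and a candidate key is a superkey with no unnecessary attributes. The enrollment table, its column names, and both function names are hypothetical. In practice keys are chosen from the meaning of the data; sample rows can only rule combinations out.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows share the same values for `attrs`."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)


def candidate_keys(rows):
    """Return the superkeys that contain no unnecessary attributes."""
    columns = list(rows[0])
    found = []
    for size in range(1, len(columns) + 1):
        for attrs in combinations(columns, size):
            # A combination that contains an already-found key is a
            # superkey with unnecessary attributes, so skip it.
            if any(set(key) <= set(attrs) for key in found):
                continue
            if is_superkey(rows, attrs):
                found.append(attrs)
    return found


# Hypothetical enrollment data: one row per student per course.
enrollment = [
    {"StId": 1, "CourseId": "CS275", "Grade": "A"},
    {"StId": 1, "CourseId": "CS101", "Grade": "B"},
    {"StId": 2, "CourseId": "CS275", "Grade": "A"},
    {"StId": 2, "CourseId": "CS101", "Grade": "A"},
]

print(is_superkey(enrollment, ("StId", "CourseId", "Grade")))  # True
print(candidate_keys(enrollment))  # [('StId', 'CourseId')]
```

The only candidate key found is the composite key (StId, CourseId), which a designer would then choose as the primary key.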
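
Worked example: nulls in functions and logic. The Null Values slide warns that nulls cause problems when functions such as COUNT, AVERAGE, and SUM are used. A minimal sketch using Python's built-in sqlite3 module shows the effect; the three product rows are invented for illustration, and other DBMSs may differ in details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (p_code TEXT PRIMARY KEY, p_price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [("P1", 10.0), ("P2", None), ("P3", 20.0)],  # None is stored as NULL
)

# COUNT(*) counts rows; COUNT(p_price), SUM and AVG skip the null price.
count_rows, count_prices, total, average = conn.execute(
    "SELECT COUNT(*), COUNT(p_price), SUM(p_price), AVG(p_price) FROM product"
).fetchone()
print(count_rows, count_prices, total, average)  # 3 2 30.0 15.0

# A comparison with NULL is neither true nor false, so P2 is left out
# even though its price is not known to be 10.
not_ten = conn.execute(
    "SELECT p_code FROM product WHERE p_price <> 10 ORDER BY p_code"
).fetchall()
print(not_ten)  # [('P3',)]
```

The average is taken over two prices, not three rows, which is the kind of silent discrepancy the slide refers to.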
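
Worked example: entity and referential integrity. The Vendor and Product schema from the slides can be declared so that the DBMS enforces both integrity rules automatically. This sketch uses Python's built-in sqlite3 module; the column types, the sample rows, and the helper name `rejected` are assumptions, since the slides give attribute names only. SQLite checks foreign keys only after `PRAGMA foreign_keys = ON` is issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
    CREATE TABLE vendor (
        v_code     INTEGER PRIMARY KEY,
        v_contact  TEXT,
        v_areacode TEXT,
        v_phone    TEXT
    );
    CREATE TABLE product (
        p_code     TEXT PRIMARY KEY NOT NULL,
        p_descript TEXT,
        p_price    REAL,
        p_onhand   INTEGER,
        v_code     INTEGER REFERENCES vendor (v_code)  -- foreign key
    );
""")


def rejected(sql):
    """Run one statement; return True if it violates an integrity rule."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Hypothetical rows: the FK either matches vendor 1 or is null.
conn.execute("INSERT INTO vendor VALUES (1, 'Sample Contact', '555', '555-0100')")
conn.execute("INSERT INTO product VALUES ('P1', 'hammer', 9.95, 8, 1)")
conn.execute("INSERT INTO product VALUES ('P2', 'saw', 14.40, 3, NULL)")

dup_pk = rejected("INSERT INTO product VALUES ('P1', 'drill', 38.95, 5, 1)")
bad_fk = rejected("INSERT INTO product VALUES ('P3', 'file', 4.99, 2, 99)")
del_parent = rejected("DELETE FROM vendor WHERE v_code = 1")
print(dup_pk, bad_fk, del_parent)  # True True True
```

The duplicate primary key violates entity integrity, while the unmatched v_code and the attempt to delete a vendor that a product still refers to both violate referential integrity.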