Relational Model (CB Chapter 4)
CPSC 356 Database
Ellen Walker
Hiram College

Relational Model
• Proposed by E.F. Codd, 1970
• Goals
  – Data independence (physical vs. conceptual)
  – Normalization (avoiding redundancy)
  – Set-oriented data manipulation language (relational algebra)
• Examples of Relational DBMS:
  – System R (IBM), Ingres (UC Berkeley)
  – Oracle (all versions), Access, RDB, DB2 …

Terminology
• A relation (instance) is a table with columns and rows.
  – (don’t confuse with relationships in ER models)
• An attribute is a named column of a table.
  – (corresponds to simple attribute in ER)
• A domain is the set of allowable values for an attribute.
• A tuple is a row of a table.

More Terminology
• The degree or arity of a relation is its number of attributes (columns)
• The cardinality of a relation is its number of tuples (rows)
• A relational database is a collection of normalized relations with distinct names.
  – We’ll get to normalization later!

Example Relation (Phone Book)
• Attributes: First, Last, Dept, Email, etc.
• Domain for Title: {Prof, Inst, Assoc, Asst}
• Tuple: (Obie, Slotterbeck, CS, Obie, 5275, Prof)
• Degree = 6, Cardinality = 4

First   Last         Dept  Email       Phone  Title
Obie    Slotterbeck  CS    Obie        5275   Prof
Irina   Lomonosov    CS    LomonosovI  5002   Inst
Ellen   Walker       CS    WalkerEL    5250   Prof
Angela  Guercio      CS    GuercioA    5048   Asst

Order Doesn’t Matter!
• This is the same phone book relation.

Email       Title  First   Phone  Last         Dept
LomonosovI  Inst   Irina   5002   Lomonosov    CS
Obie        Prof   Obie    5275   Slotterbeck  CS
WalkerEL    Prof   Ellen   5250   Walker       CS
GuercioA    Asst   Angela  5048   Guercio      CS

Sets and Subsets
• A set is an unordered collection of unique elements
  Set S = {1,2,3}
  {a,b,c} = {a,c,b}
  “1 is an element of S”
• A subset of a set is another set whose elements all come from the original set.
  {a,b} is a subset of {a,c,b}
  {1,2,3} is a subset of {1,2,3}
  {1,2,4} is not a subset of {1,2,3}
  {} (the empty set) is a subset of every set!
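The set and subset facts above, and the claim that row and column order do not matter, can be checked with Python's built-in sets. This is only a minimal sketch: the helper `make_tuple` is a hypothetical name, and modelling each tuple as a frozenset of (attribute, value) pairs is an assumption made here so that column order is ignored. Only three of the phone book rows are used.

```python
# Sets are unordered collections of unique elements.
S = {1, 2, 3}
assert {"a", "b", "c"} == {"a", "c", "b"}   # order does not matter
assert 1 in S                               # "1 is an element of S"

# Subset checks from the slide (<= is Python's subset operator).
assert {"a", "b"} <= {"a", "c", "b"}
assert {1, 2, 3} <= {1, 2, 3}
assert not ({1, 2, 4} <= {1, 2, 3})
assert set() <= S                           # the empty set is a subset of every set


def make_tuple(**attrs):
    """Hypothetical helper: a tuple as an unordered set of (attribute, value) pairs."""
    return frozenset(attrs.items())


# Three rows of the phone book relation, in the first table's order.
phone_book = {
    make_tuple(First="Obie", Last="Slotterbeck", Dept="CS",
               Email="Obie", Phone="5275", Title="Prof"),
    make_tuple(First="Irina", Last="Lomonosov", Dept="CS",
               Email="LomonosovI", Phone="5002", Title="Inst"),
    make_tuple(First="Ellen", Last="Walker", Dept="CS",
               Email="WalkerEL", Phone="5250", Title="Prof"),
}

# The same rows with columns and rows listed in a different order.
reordered = {
    make_tuple(Email="LomonosovI", Title="Inst", First="Irina",
               Phone="5002", Last="Lomonosov", Dept="CS"),
    make_tuple(Email="Obie", Title="Prof", First="Obie",
               Phone="5275", Last="Slotterbeck", Dept="CS"),
    make_tuple(Email="WalkerEL", Title="Prof", First="Ellen",
               Phone="5250", Last="Walker", Dept="CS"),
}

same_relation = phone_book == reordered
degree = len(next(iter(phone_book)))   # number of attributes per tuple
cardinality = len(phone_book)          # number of tuples
```

Because the relation is a set, adding a row that is already present would leave `cardinality` unchanged, which mirrors the "no duplicate tuples" property.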
Mathematical Relations
• Cartesian product: the set of all ordered pairs, where each pair contains one element from each original set
  {1,2,3} x {a, b} = {(1,a), (1,b), (2,a), (2,b), (3,a), (3,b)}
• Relation: any subset of a Cartesian product
  – R1 = {(1,a), (2,b), (3,a)}
  – R2 = {(1,a), (1,b), (2,b), (3,b)}
  – R3 = { }

Bigger Relations
• First, take the cross product of n sets
  – If n = 3, get triples; if n = 4, get quadruples
  – For arbitrary n, get “n-tuples”
• Now, take a subset of that big cross product
  – Several n-tuples
• Arrange them in rows, so the commas line up
  – Looks familiar…
• The domains of the attributes are the sets used in the cross product.

Database Schemas
• A schema describes a database, but not the data
• Relation schema: set of attribute and domain name pairs
  (First: char-string) (Last: char-string)
  (Dept: {CS, Math, Comm, Econ, Mgmt, etc.})
  (Phone: 4-digit-string) (Email: char-string)
  (Title: {Inst, Asst, Assoc, Prof})
• Relational Database Schema: set of relation schemas, each with a distinct name, plus integrity constraints (later)

Properties of (Database) Relations
• Every relation has a unique name
• Every attribute has a unique name (within its relation)
• Attribute’s values are all from its domain
• Because relations are sets…
  – No duplicate tuples
  – Order of tuples doesn’t matter (theoretically)
• Every cell contains a single value (single-valued attributes)

Single-valued Attributes
• “Oldest-child” is an ok attribute
• “Children” is not, because it is likely to be multi-valued
• Break up composite attributes into multiple attributes
  – Name --> First-name, Middle-init, Last-name
• Use “relationships” in the ER sense for multi-valued attributes.

Keys
• Superkey: set of attributes that uniquely identifies a tuple.
  – (The whole tuple is always a superkey - why?)
• Candidate Key: a superkey that doesn’t have a proper subset that is also a superkey
• Primary Key: the candidate key that is selected by the designer (often generated)
• Foreign Key: set of attributes in one relation that matches the primary key of another.

Properties of a Candidate Key
• Uniqueness
  – The set of attribute values in the candidate key uniquely identifies a tuple
• Irreducibility
  – If one attribute is removed from the candidate key, then uniqueness no longer holds

NULL
• NULL is considered a member of every domain
• NULL means:
  – Value is currently unknown
  – Value is not applicable for this entry (e.g. “spouse” of a single person)
• Don’t confuse NULL with
  – Zero
  – Blank or empty string

Relational Schema “Shorthand”
• List the relation name first
• Attribute names in parentheses following the relation name
• Primary key attribute(s) underlined
• Example
  – Branch (BranchNo, street, city, postcode)
  – Staff (StaffNo, fName, position, sex, DOB, salary, BranchNo)
• Arrow from foreign key to its referent

Integrity Constraints
• Rules that apply to all instances of a database
• If these rules are violated, the instance is not a legal database (it doesn’t “make sense”)
• Some come from business rules of the enterprise
  – “Students can take only 1 course in 3-week term”
• Others come from the schema design
  – “InstructorID cannot be NULL and must match the ID of a listed faculty member”

Key Constraint
• Specifies which attributes compose a candidate key (or primary key) for the relation
• Example:
  – {first, last} and {email} are both keys of the Phone Book relation
• Generally, only the primary key is specified
• In practice, an auto-generated id number is often used as the key

Basic Integrity Constraints
– Entity Integrity: No attribute of a primary key can be NULL for any tuple.
  • This constraint is implicit in the PRIMARY KEY constraint in SQL
– Referential Integrity: If a relation has a foreign key, either the foreign key is NULL, or the foreign key matches the primary key of an existing tuple in another relation.
  • Attribute names do not have to be the same

Maintaining Referential Integrity
• Adding a new tuple
  – Add with NULL foreign keys
  – Add tuple in the other relation (with primary key matching the foreign key) first
• Deleting a tuple
  – Find all tuples in other relations whose foreign keys match this one
  – Either delete those tuples, or change their foreign keys to NULL
• Modifying a tuple
  – If a foreign key is changed, make sure that the new value points to an existing tuple or is NULL

Summary of Constraints
• Domain constraint
  – Each attribute’s values come from its domain
• Key constraint
  – Every relation has a primary key
• Entity integrity constraint
  – No NULL primary key attributes
• Referential integrity constraint
  – Foreign keys are NULL or refer to existing tuples’ primary keys
• Semantic constraints
  – Any additional constraints that must be satisfied based on the client’s needs

Views
• “Relations” that appear to exist, but are actually derived from one or more base (implemented) relations.
• When a base table changes, the view changes, and (for updatable views) vice versa
• Useful for
  – Security: limit which attributes a user sees
  – Customization
  – Simplification
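The entity and referential integrity constraints summarized earlier, and the view behavior just described, can be tried out with Python's built-in sqlite3 module. This is a rough sketch, not a reference design. The Branch and Staff relations are cut down from the shorthand example, the sample rows (names, cities, salaries) are invented for illustration, and the view name `StaffPublic` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE Branch (
    -- SQLite allows NULL in a non-INTEGER primary key unless NOT NULL
    -- is stated, so entity integrity is spelled out explicitly here.
    BranchNo TEXT PRIMARY KEY NOT NULL,
    city     TEXT
);
CREATE TABLE Staff (
    StaffNo  TEXT PRIMARY KEY NOT NULL,
    fName    TEXT,
    salary   INTEGER,
    BranchNo TEXT REFERENCES Branch(BranchNo)  -- foreign key, may be NULL
);
-- Security-style view: hides the salary attribute.
CREATE VIEW StaffPublic AS
    SELECT StaffNo, fName, BranchNo FROM Staff;
""")

# Add the referenced tuple first, then the tuples that point at it.
conn.execute("INSERT INTO Branch VALUES ('B1', 'London')")
conn.execute("INSERT INTO Staff VALUES ('S1', 'Ann', 30000, 'B1')")
conn.execute("INSERT INTO Staff VALUES ('S2', 'Bob', 25000, NULL)")  # NULL FK is legal

# Referential integrity: 'B9' matches no Branch tuple, so this is rejected.
try:
    conn.execute("INSERT INTO Staff VALUES ('S3', 'Cy', 20000, 'B9')")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

# Entity integrity: a NULL primary key is rejected.
try:
    conn.execute("INSERT INTO Staff VALUES (NULL, 'Di', 20000, 'B1')")
    null_pk_rejected = False
except sqlite3.IntegrityError:
    null_pk_rejected = True

# The view shows only the permitted attributes...
view_rows = conn.execute(
    "SELECT * FROM StaffPublic ORDER BY StaffNo").fetchall()

# ...and reflects later changes to the base table.
conn.execute("UPDATE Staff SET fName = 'Anne' WHERE StaffNo = 'S1'")
updated_name = conn.execute(
    "SELECT fName FROM StaffPublic WHERE StaffNo = 'S1'").fetchone()[0]
```

Views in SQLite are read-only, so this sketch only shows the base-table-to-view direction; updating through a view depends on the DBMS.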