The Relational Model Chapter 3 Relational Data Model, R. Ramakrishnan and J. Gehrke with Dr. Eick’s additions 1 Why Study the Relational Model? • Most widely used model. • Vendors: IBM, Informix, Microsoft, Oracle, Sybase, etc. • “Legacy systems” in older models • E.G., IBM’s IMS • Recent competitor: object-oriented model • ObjectStore, Versant, Ontos • A synthesis emerging: object-relational model • Informix Universal Server, UniSQL, O2, Oracle, DB2 Relational Data Model, R. Ramakrishnan and J. Gehrke with Dr. Eick’s additions 2 The SQL Query Language • Developed by IBM (system R) in the 1970s • Need for a standard since it is used by many vendors • Standards: • • • • SQL-86 SQL-89 (minor revision) SQL-92 (major revision, current standard) SQL-99 (major extensions) Relational Data Model, R. Ramakrishnan and J. Gehrke with Dr. Eick’s additions 3 Summary Relational Data Model I 1. 2. 3. A relation database consists of a set of relations (tables) Each relation consist of a set of attribute/domain pairs Attributes have to be single-valued. Additionally, “null” values are supported 4. The attributes of a relation are unordered 5. Attribute values have to be atomic (cannot be tuples or relations) 6. Relations store sets of tuples (queries return bags of tuples!!); each tuple of a relation stores a value for each attribute of the relation; all tuples stored in a relation must be different 7. Tuples in a relation are unordered Relational Data Model, R. Ramakrishnan and J. Gehrke with Dr. Eick’s additions 4 Summary Relational Data Model II 8. Each relation has one primary key that consists of a set of attributes that uniquely identifies the tuples in a relation). 9. Relationships between object in the real world are represented in the relational data model by “exporting primary keys of the objects that participate in the relationship”. The exported keys are called foreign keys. 10. Multi-values attributes can be presented either by using a separate relation or by representing the object through multiple tuples (one for each value of the multi-valued attribute). 11. Optional attributes (attributes that not necessarily have a value) can be represented using null values; furthermore, they can also be represented by using a separate relation that stores the relation’s primary key attributes and the optional attribute. 5 Relational Data Model, R. Ramakrishnan and J. Gehrke with Dr. Eick’s additions Creating Relations in SQL • Creates the Students CREATE TABLE Students (sid: CHAR(20), relation. Observe that the name: CHAR(20), type (domain) of each field login: CHAR(10), is specified, and enforced by age: INTEGER, the DBMS whenever tuples gpa: REAL) are added or modified. • As another example, the CREATE TABLE Enrolled Enrolled table holds (sid: CHAR(20), information about courses cid: CHAR(20), that students take. grade: CHAR(2)) Relational Data Model, R. Ramakrishnan and J. Gehrke with Dr. Eick’s additions 6 Primary Key Constraints • A set of fields is a candidate key for a relation if : 1. No two distinct tuples can have same values in all key fields, and 2. This is not true for any subset of the candidate key. • Part 2 false? A superkey. • If there’s >1 key for a relation, one of the keys is chosen (by DBA) to be the primary key. • E.g., sid is a candidate key for Students. (What about name?) The set {sid, gpa} is a superkey. • Remark: Sometimes people say ‘key’ when the refer to candidate keys. Relational Data Model, R. Ramakrishnan and J. Gehrke with Dr. Eick’s additions 7 Primary and Candidate Keys in SQL • Possibly many candidate keys (specified using UNIQUE), one of which is chosen as the primary key. v v “For a given student and course, CREATE TABLE Enrolled (sid CHAR(20) there is a single grade.” vs. cid CHAR(20), “Students can take only one grade CHAR(2), course, and receive a single grade PRIMARY KEY (sid,cid) ) for that course; further, no two CREATE TABLE Enrolled students in a course receive the (sid CHAR(20) same grade.” cid CHAR(20), Used carelessly, an IC can prevent grade CHAR(2), the storage of database instances PRIMARY KEY (sid), that arise in practice! UNIQUE (cid, grade) ) Relational Data Model, R. Ramakrishnan and J. Gehrke with Dr. Eick’s additions 8 Foreign Keys, Referential Integrity • Foreign key : Set of fields in one relation that is used to `refer’ to a tuple in another relation. (Must correspond to primary key of the second relation.) Like a `logical pointer’. • E.g. sid is a foreign key referring to Students: • Enrolled(sid: string, cid: string, grade: string) • If all foreign key constraints are enforced, referential integrity is achieved, i.e., no dangling references. • Can you name a data model w/o referential integrity? • Links in HTML! Relational Data Model, R. Ramakrishnan and J. Gehrke with Dr. Eick’s additions 9 Foreign Keys in SQL • Only students listed in the Students relation should be allowed to enroll for courses. CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid,cid), FOREIGN KEY (sid) REFERENCES Students ) Enrolled sid 53666 53666 53650 53666 cid grade Carnatic101 C Reggae203 B Topology112 A History105 B Students sid 53666 53688 53650 name login Jones jones@cs Smith smith@eecs Smith smith@math Relational Data Model, R. Ramakrishnan and J. Gehrke with Dr. Eick’s additions age 18 18 19 gpa 3.4 3.2 3.8 10 Enforcing Referential Integrity • Consider Students and Enrolled; sid in Enrolled is a foreign key that references Students. • What should be done if an Enrolled tuple with a nonexistent student id is inserted? (Reject it!) • What should be done if a Students tuple is deleted? • • • • Also delete all Enrolled tuples that refer to it. Disallow deletion of a Students tuple that is referred to. Set sid in Enrolled tuples that refer to it to a default sid. (In SQL, also: Set sid in Enrolled tuples that refer to it to a special value null, denoting `unknown’ or `inapplicable’.) • Similar if primary key of Students tuple is updated. Relational Data Model, R. Ramakrishnan and J. Gehrke with Dr. Eick’s additions 11 Referential Integrity in SQL/92 • SQL/92 supports all 4 CREATE TABLE Enrolled options on deletes and (sid CHAR(20), updates. cid CHAR(20), grade CHAR(2), • Default is NO ACTION PRIMARY KEY (sid,cid), (delete/update is rejected) FOREIGN KEY (sid) • CASCADE (also delete REFERENCES Students all tuples that refer to ON DELETE CASCADE deleted tuple) ON UPDATE SET DEFAULT ) • SET NULL / SET DEFAULT (sets foreign key value of referencing tuple) Relational Data Model, R. Ramakrishnan and J. Gehrke with Dr. Eick’s additions 12 Graphical Short Notations for Relational Schemas • R(A,B,C), S(D,E) meaning: {A,B} is a primary key for R; D is a primary key for S • S(D,E) meaning: X is a foreign key in T that references attribute D of relation T: T[X] S[D] T(X,Y,Z) Remark: The graphical short notation only specifies relation names, attributes, primary keys, and foreign keys but omits other schema information (such as attribute domains, uniqueness constraints, …) Relational Data Model, R. Ramakrishnan and J. Gehrke with Dr. Eick’s additions 13 Relational Integrity Rules • Entity Integrity: Null values are disallowed for primary key attributes of a relation. • Referential Integrity: the value of a foreign key must be either null or there must be a tuple with the foreign key value for the exported reference attribute(s) in the referenced relation. • Stability of the Primary Key: Updates of primary key attributes should be disallowed. • Remark: The referenced attribute of a foreign key is usually the primary key of the relation; however, SQL-92 allows foreign keys to reference any attribute(s) of a relation, even attributes that are not superkeys. Syntax: FOREIGN KEY <attribute-list> REFERENCES <relation><attribute-list> e.g. FOREIGN KEY (hssn,wssn,from) REFERENCES Wedding(husband,wife,from) Relational Data Model, R. Ramakrishnan and J. Gehrke with Dr. Eick’s additions 14