Chapter 3: Relational Model
Jay M. Lightfoot, Ph.D.

Contents
1 Objectives
2 Logical View of Data
  2.1 Table characteristics
  2.2 terms
    2.2.1 domain
    2.2.2 primary key
    2.2.3 abstract data type
    2.2.4 tuple
    2.2.5 attribute
3 Keys
  3.1 determinant
    3.1.1 functional dependence
    3.1.2 multi-functional dependence
  3.2 types of keys
    3.2.1 super key
    3.2.2 candidate key
    3.2.3 primary key
    3.2.4 alternate key
    3.2.5 foreign key
    3.2.6 secondary key
    3.2.7 composite key
  3.3 entity integrity
  3.4 referential integrity
4 DB Integrity
  4.1 domain integrity
  4.2 entity integrity
  4.3 referential integrity
  4.4 business rules
5 Relational DB Query Languages
  5.1 relational algebra
    5.1.1 select
    5.1.2 project
    5.1.3 join
      5.1.3.1 equijoin
      5.1.3.2 theta-join
      5.1.3.3 natural join
      5.1.3.4 outer join
    5.1.4 intersect
    5.1.5 union
    5.1.6 difference
    5.1.7 Cartesian product
    5.1.8 division
  5.2 relational calculus
6 Meta Data Components
  6.1 data dictionary
  6.2 system catalog
    6.2.1 synonym
    6.2.2 homonym
7 Relationship Types
  7.1 1:1
  7.2 1:N
  7.3 M:N
    7.3.1 intersection table
    7.3.2 dummy table
8 Data Redundancy in Relational Model
9 Indexes
  9.1 pointer
  9.2 unique index
  9.3 non-unique index

1 Objectives
- The relational database model takes a logical view of data.
- The relational model's basic components are entities, attributes, and relationships among entities.
- Entities and their attributes are organized into tables.
- Know the relational database operators, the data dictionary, and the system catalog.
- Understand how data redundancy is handled in the relational model.
- Understand why indexing is important.

2 Logical View of Data
Relational model
- Enables us to view data logically rather than physically
- Reminds us of the simpler file concept of data storage
Table
- Has the advantages of structural and data independence
- Resembles a file from a conceptual point of view
- Is easier to understand than its hierarchical and network database predecessors

2.1 Table characteristics
- Table: a two-dimensional structure composed of rows and columns
- Contains a group of related entities (an entity set)
- The terms entity set and table are often used interchangeably
- A table is also called a relation because the relational model’s creator, Codd, used the term relation as a synonym for table
- Think of a table as a persistent relation: a relation whose contents can be permanently saved for future use

5 "Rules" of a relational table:
1. Tuple and attribute order is immaterial.
2. Every tuple is unique.
3. Cells contain single values.
4. All values within an attribute come from the same domain.
5. Relation names within the database and attribute names within the relation are unique.
These 5 rules fully describe relations in the relational database model.

2.2 terms

2.2.1 domain
The set of allowable values that an attribute may take on.
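As a rough illustration of a domain (and of rule 4 above), the sketch below declares a domain for each attribute and reports values that fall outside it. The STUDENT relation, its attribute names, and the domains themselves are made up for this example; they do not come from the chapter.

```python
# Hypothetical domains: each attribute maps to a test for its allowable values.
DOMAINS = {
    "student_id": lambda v: isinstance(v, int) and v > 0,
    "class_level": lambda v: v in {"FR", "SO", "JR", "SR"},
}

def in_domain(attribute, value):
    """Return True if value belongs to the domain declared for attribute."""
    return DOMAINS[attribute](value)

def violations(rows):
    """List (row_index, attribute) pairs whose value falls outside its domain."""
    return [
        (i, attr)
        for i, row in enumerate(rows)
        for attr, value in row.items()
        if not in_domain(attr, value)
    ]

# Hypothetical sample relation
STUDENT = [
    {"student_id": 1, "class_level": "FR"},
    {"student_id": 2, "class_level": "XX"},  # "XX" is outside the domain
]

print(violations(STUDENT))  # [(1, 'class_level')]
```

A real DBMS would enforce the same idea through data types and column constraints rather than application code.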
2.2.2 primary key
- A primary key (PK) is an attribute (or a combination of attributes) that uniquely identifies any given entity (row).
- A key’s role is based on determination: if you know the value of attribute A, you can look up (determine) the value of attribute B.

2.2.3 abstract data type
A data type that describes a set of similar objects with a shared, encapsulated data representation and methods. An abstract data type is generally used to describe complex objects. It is similar to a class in the object-oriented domain.

2.2.4 tuple
A row in a relation.

2.2.5 attribute
A column in a relation.

3 Keys
Generally speaking, keys consist of one or more attributes that determine other attributes. There are various types of keys, but all share this characteristic.

Keys are generally associated with indexes; however, keys and indexes are not the same thing. An index is a small file that uses key information to speed up the lookup process into another file. Keys, on the other hand, are a type of integrity constraint.

3.1 determinant
When the value of one attribute can be used to "determine" the value of another, the first attribute is said to be the determinant of the second.

3.1.1 functional dependence
"The attribute B is functionally dependent on the attribute A if each value in column A determines one and only one value of column B."
For example, if knowing A means that you also know B and C, then A --> B,C (read "A determines B and C"). Likewise, B and C are functionally dependent on A.

3.1.2 multi-functional dependence
The attribute B is multi-functionally dependent on the attribute A if each value in column A determines a well-defined set of values for column B. (This is more commonly called a multivalued dependency.)
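The definition of functional dependence in 3.1.1 can be tested against sample data. The following is a minimal sketch, assuming a relation is held as a list of dictionaries; the function name `determines` and the STUDENT data are hypothetical.

```python
def determines(rows, lhs, rhs):
    """Return True if, in these rows, each combination of lhs values is paired
    with exactly one combination of rhs values (lhs --> rhs)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for a key and returns it later
        if seen.setdefault(key, value) != value:
            return False  # same determinant value, different dependent value
    return True

# Hypothetical sample relation
STUDENT = [
    {"stu_num": 1, "stu_name": "Lee", "major": "CIS"},
    {"stu_num": 2, "stu_name": "Ortiz", "major": "CIS"},
    {"stu_num": 3, "stu_name": "Lee", "major": "ACCT"},
]

print(determines(STUDENT, ["stu_num"], ["stu_name", "major"]))  # True
print(determines(STUDENT, ["stu_name"], ["major"]))             # False: two rows named Lee disagree
```

Note that a check like this can only refute a dependency. A functional dependence is a statement about every legal state of the relation, so a True result on one sample does not prove it.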
3.2 types of keys

3.2.1 super key
Superkey: any attribute or combination of attributes that uniquely identifies each entity (row).

3.2.2 candidate key
Candidate key: a minimal superkey, that is, one with no redundant attributes.

3.2.3 primary key
Primary key: the candidate key that is selected as the “prime” key.

3.2.4 alternate key
Alternate key: any candidate key that is not selected to be the primary key.

3.2.5 foreign key
Foreign key (FK): an attribute (or combination of attributes) whose values match primary key values in the related table. Foreign keys are tied to referential integrity, which exists when every foreign key value points to a valid primary key value.

3.2.6 secondary key
Secondary key: a set of attributes that determine other attributes based upon the values currently held. Its values are not required to be unique.

3.2.7 composite key
A key composed of more than one attribute. Also known as a "concatenated key".

3.3 entity integrity
The formal type of integrity associated with primary keys. It says that every primary key value must be unique in the relation and may not be null; nor may any part of a composite primary key be null.

3.4 referential integrity
A term related to foreign keys. Referential integrity is said to exist when a foreign key has a matching primary key in another relation.

4 DB Integrity

4.1 domain integrity
The property that the value of an attribute conforms to the domain defined for the attribute.

4.2 entity integrity
Primary keys must be unique, and no null values are allowed in any part of a primary key.

4.3 referential integrity
Referential integrity is said to exist when a foreign key has a matching primary key in another relation. Said another way, if the foreign key contains a value, that value refers to an existing valid tuple in another relation.

4.4 business rules
Custom integrity rules that are specific to the business. They can be anything. Modern DBMSs are capable of accepting and enforcing these rules.
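The entity and referential integrity rules above can be expressed as simple checks. This is only a sketch under stated assumptions: relations are lists of dictionaries, null is represented by None, and the function names and sample rows (including `cus_code`) are hypothetical. A DBMS enforces these rules itself through key constraints.

```python
def entity_integrity_ok(rows, pk):
    """Every primary key value is unique and no part of it is null (None)."""
    keys = [tuple(row[a] for a in pk) for row in rows]
    no_nulls = all(part is not None for key in keys for part in key)
    return no_nulls and len(set(keys)) == len(keys)

def referential_integrity_ok(child, fk, parent, pk):
    """Every foreign key in child that contains a value matches a primary key in parent."""
    parent_keys = {tuple(row[a] for a in pk) for row in parent}
    for row in child:
        value = tuple(row[a] for a in fk)
        if any(part is None for part in value):
            continue  # a null foreign key refers to nothing, so it cannot dangle
        if value not in parent_keys:
            return False
    return True

# Hypothetical sample relations
AGENT = [{"agent_code": 501}, {"agent_code": 502}]
CUSTOMER = [
    {"cus_code": 10010, "agent_code": 502},
    {"cus_code": 10011, "agent_code": None},  # no agent assigned yet
]

print(entity_integrity_ok(AGENT, ["agent_code"]))                                   # True
print(referential_integrity_ok(CUSTOMER, ["agent_code"], AGENT, ["agent_code"]))    # True

dangling = CUSTOMER + [{"cus_code": 10012, "agent_code": 999}]  # 999 is not an agent
print(referential_integrity_ok(dangling, ["agent_code"], AGENT, ["agent_code"]))    # False
```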
5 Relational DB Query Languages

5.1 relational algebra
A procedural approach to data query language that processes data a "set-at-a-time". It defines a theoretical way of manipulating table contents using relational operators:
- SELECT
- PROJECT
- JOIN
- INTERSECT
- UNION
- DIFFERENCE
- PRODUCT
- DIVIDE
Actually, only select, project, union, difference, and product are needed, because the rest can be derived from this base set of operators. The most commonly used operators are select, project, and join. Applying relational algebra operators to existing tables (relations) produces new relations.

5.1.1 select
Select (restrict)
- Yields the rows of a table that match a specified criterion; with no criterion, it yields all rows
- Yields a horizontal subset of a table

5.1.2 project
Project
- Yields all values for selected attributes
- Yields a vertical subset of a table

5.1.3 join
Join
- Allows us to combine information from two or more tables
- The real power behind the relational database, allowing the use of independent tables linked by common attributes

5.1.3.1 equijoin
Equijoin
- Links tables on the basis of an equality condition that compares specified columns of each table
- The outcome does not eliminate duplicate columns
- The condition or criterion used to join the tables must be explicitly defined
- Takes its name from the equality comparison operator (=) used in the condition

5.1.3.2 theta-join
Theta join: a join whose condition uses any comparison operator other than equality.

5.1.3.3 natural join
Links tables by selecting only the rows with common values in their common attribute(s). It is the result of a three-stage process, described here for CUSTOMER and AGENT tables that share the AGENT_CODE attribute:
1. The PRODUCT of the tables is created.
2. A SELECT is performed on the Step 1 output to yield only the rows for which the AGENT_CODE values are equal. The common column(s) are called the join column(s).
3. A PROJECT is performed on the Step 2 results to yield a single copy of each attribute, thereby eliminating duplicate columns.
The final outcome is a table that
- Does not include unmatched pairs
- Provides only copies of the matches
If no match is made between the table rows, the new table does not include the unmatched row. The column on which we made the JOIN (that is, AGENT_CODE) occurs only once in the new table. If the same AGENT_CODE were to occur several times in the AGENT table, a customer would be listed for each match.

5.1.3.4 outer join
Matched pairs are retained, and any unmatched values in the other table are left null. In an outer join of the tables CUSTOMER and AGENT, two scenarios are possible:
- Left outer join: yields all rows in the CUSTOMER table, including those that do not have a matching value in the AGENT table
- Right outer join: yields all rows in the AGENT table, including those that do not have matching values in the CUSTOMER table

5.1.4 intersect
Intersect: yields only the rows that appear in both tables.

5.1.5 union
Union: combines all rows from two tables, excluding duplicate rows. The tables must have the same attribute characteristics.

5.1.6 difference
Difference: yields all rows in one table that are not found in the other table; that is, it subtracts one table from the other.
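The operators described so far can be sketched in a few lines of Python to make their behavior concrete. This is an illustration only, not how a DBMS implements them. It assumes that a relation is a list of dictionaries, that the two joined tables share exactly one attribute name, and that the sample CUSTOMER and AGENT rows (and the names `cus_code`, `cus_name`, `agent_phone`) are invented. The natural join follows the three stages above; the small `product` helper it needs is the PRODUCT operator.

```python
def select(rows, predicate):
    """SELECT (restrict): horizontal subset, the rows satisfying the predicate."""
    return [row for row in rows if predicate(row)]

def project(rows, attrs):
    """PROJECT: vertical subset; duplicate result rows are dropped."""
    result = []
    for row in rows:
        slim = {a: row[a] for a in attrs}
        if slim not in result:
            result.append(slim)
    return result

def product(left, right):
    """PRODUCT: every pairing of rows. Right-hand attribute names that clash
    with left-hand ones get a 'right.' prefix so names stay unique."""
    pairs = []
    for l in left:
        for r in right:
            row = dict(l)
            for name, value in r.items():
                row["right." + name if name in l else name] = value
            pairs.append(row)
    return pairs

def natural_join(left, right, common):
    """Natural join on one shared attribute: PRODUCT, then SELECT, then PROJECT."""
    paired = product(left, right)                                                # stage 1
    matched = select(paired, lambda row: row[common] == row["right." + common])  # stage 2
    if not matched:
        return []
    keep = [name for name in matched[0] if name != "right." + common]
    return project(matched, keep)                                                # stage 3

def left_outer_join(left, right, common):
    """Keep every left row; pad with None where the right table has no match."""
    right_attrs = [name for name in right[0] if name != common] if right else []
    result = natural_join(left, right, common)
    matched_values = {row[common] for row in result}
    for row in left:
        if row[common] not in matched_values:
            result.append({**row, **{name: None for name in right_attrs}})
    return result

# Set operators; both inputs are assumed to be duplicate-free and union-compatible.
def union(a, b):
    return a + [row for row in b if row not in a]

def intersect(a, b):
    return [row for row in a if row in b]

def difference(a, b):
    return [row for row in a if row not in b]

# Hypothetical sample relations
CUSTOMER = [
    {"cus_code": 1, "cus_name": "Ramas", "agent_code": 501},
    {"cus_code": 2, "cus_name": "Dunne", "agent_code": 502},
    {"cus_code": 3, "cus_name": "Smith", "agent_code": 421},  # no such agent
]
AGENT = [
    {"agent_code": 501, "agent_phone": "555-0101"},
    {"agent_code": 502, "agent_phone": "555-0102"},
    {"agent_code": 503, "agent_phone": "555-0103"},  # no customers
]

print(len(product(CUSTOMER, AGENT)))                        # 9
print(natural_join(CUSTOMER, AGENT, "agent_code"))          # Ramas and Dunne only
print(left_outer_join(CUSTOMER, AGENT, "agent_code")[-1])   # Smith, with agent_phone None
```

In this sketch a right outer join can be obtained by swapping the arguments, as in `left_outer_join(AGENT, CUSTOMER, "agent_code")`, which keeps agent 503 with null customer attributes.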
5.1.7 Cartesian product
Product (Cartesian product): yields all possible pairs of rows from two tables.

5.1.8 division
DIVIDE requires the use of one single-column table and one two-column table. The result lists the values from the two-column table's other column that are paired with every value in the single-column table.

5.2 relational calculus
A non-procedural query language that processes data a "set-at-a-time". It is often considered superior to relational algebra from the user's point of view because it is non-procedural; however, it does not have any more "expressive capability" than relational algebra. That is, it cannot do any more than relational algebra; it is just easier to use.
4 Forms:
- Tuple calculus
- Domain calculus
- Transform-oriented languages (SQL)
- QBE

6 Meta Data Components
The structural component of the database is called "metadata." This component is normally stored in the data dictionary and the system catalog.

6.1 data dictionary
Data dictionary
- Provides a detailed accounting of all tables found within the user/designer-created database
- Contains (at least) all the attribute names and characteristics for each table in the system
- Contains metadata (data about data)
- Sometimes described as “the database designer’s database” because it records the design decisions about tables and their structures

6.2 system catalog
System catalog
- Contains metadata
- A detailed system data dictionary that describes all objects within the database
- The terms “system catalog” and “data dictionary” are often used interchangeably
- Can be queried just like any user/designer-created table

6.2.1 synonym
Different names used to describe the same attribute. For example, teacher and instructor are likely synonyms in a college setting.

6.2.2 homonym
The same attribute name used with different meanings. For example, 'phone' in one relation might mean home phone while 'phone' in another could mean work phone (or cell, or fax, ...). Homonyms can cause a good deal of confusion.
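To show what "can be queried just like any user/designer-created table" means in practice for the system catalog of Section 6.2, the sketch below uses SQLite through Python's standard sqlite3 module. SQLite is my choice for the illustration, not something the chapter prescribes; its catalog table is named sqlite_master, and other DBMSs expose their catalogs under different names. The agent and customer tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE agent (agent_code INTEGER PRIMARY KEY, agent_phone TEXT)")
conn.execute(
    "CREATE TABLE customer ("
    " cus_code INTEGER PRIMARY KEY,"
    " cus_name TEXT NOT NULL,"
    " agent_code INTEGER REFERENCES agent(agent_code))"
)

# The catalog is queried with ordinary SQL, like any other table.
tables = [name for (name,) in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)   # ['agent', 'customer']

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(customer)")]
print(columns)  # [('cus_code', 'INTEGER'), ('cus_name', 'TEXT'), ('agent_code', 'INTEGER')]
```

The attribute names and types returned here are the kind of metadata that a data dictionary records for each table.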
7 Relationship Types
- 1:M relationship: the relational modeling ideal; should be the norm in any relational database design
- M:N relationship: should not be implemented directly because it leads to data redundancies; it is broken into 1:M relationships instead (see 7.3)
- 1:1 relationship: should be rare in any relational database design

7.1 1:1
- Found in some database environments
- One entity can be related to only one other entity, and vice versa
- Often means that entity components were not defined properly
- Could indicate that two entities actually belong in the same table
- Sometimes 1:1 relationships are appropriate (e.g., a true 1:1 such as department to supervisor)
Reasons you may want a 1:1 relationship:
1. It is a true 1:1 (see above).
2. A single relation with all the attributes would be too big to store.
3. Performance is better if you split the relation up.
4. It implements a subtype-supertype structure.

7.2 1:N
The most common relationship in reality (also written 1:M). Most database models are designed to show these by default.

7.3 M:N
- Most data models cannot handle this directly due to the complex nature of the relationship
- Can be implemented by breaking it up to produce a set of 1:M relationships
- The problems inherent in an M:N relationship can be avoided by creating a composite entity or bridge entity (also known as an intersection table)

7.3.1 intersection table
Intersection table
- Implementation of a composite entity
- Yields the required M:N to 1:M conversion
- The composite entity table must contain at least the primary keys of the original tables
- The linking table contains multiple occurrences of the foreign key values
- Additional attributes may be assigned as needed

7.3.2 dummy table
The same as an intersection table, but one that has only key information, with no additional attributes included.
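The M:N to 1:M conversion of Section 7.3 can be sketched with plain Python structures. Everything here is a hypothetical example: a student can take many classes and a class can have many students, so an ENROLL intersection table carries the two primary keys as foreign keys, plus one extra attribute (grade). Dropping the grade would leave only key information, which is what 7.3.2 calls a dummy table.

```python
# Hypothetical base tables, keyed by their primary keys
STUDENT = {321: "Bowser", 324: "Smithson"}      # stu_num -> stu_name
CLASS = {10014: "ACCT-211", 10018: "CIS-220"}   # class_code -> course

# Intersection table: the composite key (stu_num, class_code) is made of the
# two foreign keys; grade is an additional attribute.
ENROLL = [
    {"stu_num": 321, "class_code": 10014, "grade": "C"},
    {"stu_num": 321, "class_code": 10018, "grade": "B"},
    {"stu_num": 324, "class_code": 10014, "grade": "A"},
]

def classes_for(stu_num):
    """Follow the STUDENT 1:M ENROLL side, then look up each class."""
    return [CLASS[e["class_code"]] for e in ENROLL if e["stu_num"] == stu_num]

def students_in(class_code):
    """Follow the CLASS 1:M ENROLL side, then look up each student."""
    return [STUDENT[e["stu_num"]] for e in ENROLL if e["class_code"] == class_code]

print(classes_for(321))     # ['ACCT-211', 'CIS-220']
print(students_in(10014))   # ['Bowser', 'Smithson']
```

Each foreign key value (321, 10014) appears several times in ENROLL, which is the "multiple occurrences of the foreign key values" noted above, while each (stu_num, class_code) pair appears only once.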
8 Data Redundancy in Relational Model
- Data redundancy leads to data anomalies, and such anomalies can destroy database effectiveness
- Foreign keys are a form of controlled redundancy: they control data redundancies by using common attributes shared by tables, and they are crucial to exercising data redundancy control in the relational model
- Sometimes data redundancy is necessary (foreign keys and efficiency concerns are the two main reasons)

9 Indexes
- An index is a data structure used to speed up access to rows in a table
- Index key: the index’s reference point (e.g., the primary key); it points to the data location identified by the key
- Unique index: an index in which the index key can have only one pointer value (row) associated with it
Each index is associated with only one table, and a single table can have multiple indexes. Be aware that while indexes speed up access, they slow down insert, update, and delete operations and take up space on the disk drive.

9.1 pointer
The address of (or an indirect link to) the tuple pointed to by the index. It can be absolute (a physical address), relative (an offset from a known starting point), or logical (computed using an algorithm).

9.2 unique index
An index guaranteed to point to a unique tuple. No duplicates are allowed. Required for primary keys.

9.3 non-unique index
An index that allows duplicate values. Associated most often with secondary keys.
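The difference between a unique and a non-unique index can be sketched with a dictionary that maps each index key value to a list of pointers. This is a toy model, not how a DBMS stores indexes (those typically use structures such as B-trees). The `Index` class, the CUSTOMER rows, and the use of list positions as relative pointers are all assumptions made for the illustration.

```python
class Index:
    """Maps an index key value to the positions (pointers) of the matching rows."""

    def __init__(self, rows, key, unique=False):
        self.key = key
        self.unique = unique
        self.entries = {}
        for position, row in enumerate(rows):
            self.add(row[key], position)

    def add(self, value, position):
        pointers = self.entries.setdefault(value, [])
        if self.unique and pointers:
            # A unique index allows only one pointer per key value.
            raise ValueError(f"duplicate value {value!r} in unique index on {self.key}")
        pointers.append(position)

    def lookup(self, value):
        """Return the pointers for value without scanning the whole table."""
        return self.entries.get(value, [])

# Hypothetical sample table; a row's list position serves as a relative pointer.
CUSTOMER = [
    {"cus_code": 10010, "agent_code": 501},
    {"cus_code": 10011, "agent_code": 502},
    {"cus_code": 10012, "agent_code": 501},
]

by_code = Index(CUSTOMER, "cus_code", unique=True)   # primary key: unique index
by_agent = Index(CUSTOMER, "agent_code")             # secondary key: non-unique index

print(by_code.lookup(10011))   # [1]
print(by_agent.lookup(501))    # [0, 2]
```

Building a unique index on agent_code would raise ValueError here, because the value 501 occurs in two rows. The sketch also hints at the cost noted above: every insert into the table must call `add` on each of its indexes.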