Chapter 2 2 The Relational Database Model Database Systems: Design, Implementation, and Management 4th Edition Peter Rob & Carlos Coronel A Logical View of Data 2 Relational database model’s structural and data independence enables us to view data logically rather than physically. The logical view allows a simpler file concept of data storage. The use of logically independent tables is easier to understand. Logical simplicity yields simpler and more effective database design methodologies. A Logical View of Data Entities and Attributes 2 An entity is a person, place, event, or thing for which we intend to collect data. University -- Students, Faculty Members, Courses Airlines -- Pilots, Aircraft, Routes, Suppliers Each entity has certain characteristics known as attributes. Student -- Student Number, Name, GPA, Date of Enrollment, Data of Birth, Home Address, Phone Number, Major Aircraft -- Aircraft Number, Date of Last Maintenance, Total Hours Flown, Hours Flown since Last Maintenance A Logical View of Data Entities and Attributes 2 A grouping of related entities becomes an entity set. The STUDENT entity set contains all student entities. The FACULTY entity set contains all faculty entities. The AIRCRAFT entity set contains all aircraft entities. A Logical View of Data Tables and Their Characteristics 2 A table contains a group of related entities -- i.e. an entity set. The terms entity set and table are often used interchangeably. A table is also called a relation. Summary of the Characteristics of a Relational Table 2 Table 2.1 A Listing of the STUDENT Table Attribute Values 2 Figure 2.1 Keys 2 Controlled redundancy (shared common attributes) makes the relational database work. The primary key of one table appears again as the link (foreign key) in another table. If the foreign key contains either matching values or nulls, the table(s) that make use of such a foreign key are said to exhibit referential integrity. Figure 2.2 2 An Example of a Simple Relational Database The Relational Schema for the CH2_SALE_CO Database 2 Figure 2.3 Keys 2 A key helps define entity relationships. The key’s role is based on a concept known as determination, which is used in the definition of functional dependence. The attribute B is functionally dependent on A if A determines B. An attribute that is part of a key is known as a key attribute. A multi-attribute key is known as a composite key. If the attribute (B) is functionally dependent on a composite key (A) but not on any subset of that composite key, the attribute (B) is fully functionally dependent on (A). Student Classification 2 Table 2.2 Relational Database Keys 2 Table 2.3 Integrity Rules Revisited 2 Table 2.4 Figure 2.4 2 An Illustration of Integrity Rules Relational Database Operators 2 The degree of relational completeness can be defined by the extent to which relational algebra is supported. Relational algebra defines the theoretical way of manipulating table contents using the eight relational functions: SELECT, PROJECT, JOIN, INTERSECT, UNION, DIFFERENCE, PRODUCT, and DIVIDE. Relational Database Operators UNION combines all rows from two tables. The two tables must be union compatible. 2 Figure 2.5 UNION Relational Database Operators 2 INTERSECT produces a listing that contains only the rows that appear in both tables. The two tables must be union compatible. Figure 2.6 INTERSECT Relational Database Operators 2 DIFFERENCE yields all rows in one table that are not found in the other table; i.e., it subtracts one table from the other. The tables must be union compatible. Figure 2.7 DIFFERENCE Relational Database Operators PRODUCT produces a list of all possible pairs of rows from two tables. 2 Figure 2.8 PRODUCT Relational Database Operators 2 SELECT yields values for all attributes found in a table. It yields a horizontal subset of a table. Relational Database Operators PROJECT produces a list of all values for selected attributes. It yields a vertical subset of a table. 2 Relational Database Operators 2 JOIN allows us to combine information from two or more tables. JOIN is the real power behind the relational database, allowing the use of independent tables linked by common attributes. Relational Database Operators Natural 2 JOIN links tables by selecting only the rows with common values in their common attribute(s). It is the result of a three-stage process: A PRODUCT of the tables is created. (Figure 2.12) A SELECT is performed on the output of the first step to yield only the rows for which the common attribute values match. (Figure 2.13) A PROJECT is performed to yield a single copy of each attribute, thereby eliminating the duplicate column. (Figure 2.14) Natural Join, Step 1: PRODUCT 2 Figure 2.12 Figure 2.13 Natural Join, Step 2: SELECT 2 Figure 2.14 Natural Join, Step 3: PROJECT Relational Database Operators EquiJOIN links tables based on an equality condition that compares specified columns of each table. The outcome of the EquiJOIN does not eliminate duplicate columns and the condition or criteria to join the tables must be explicitly defined. Theta JOIN is an equiJOIN that compares specified columns of each table using a comparison operator other than the equality comparison operator. In an Outer JOIN, the unmatched pairs would be retained and the values for the unmatched other tables would be left blank or null. 2 Outer JOIN 2 Figure 2.15 Relational Database Operators DIVIDE requires the use of one singlecolumn table and one two-column table. 2 Figure 2.16 DIVIDE The Data Dictionary and the System Catalog 2 Data dictionary contains metadata to provide detailed accounting of all tables within the database. System catalog is a very detailed system data dictionary that describes all objects within the database. System catalog is a system-created database whose tables store the database characteristics and contents. System catalog tables can be queried just like any other tables. System catalog automatically produces database documentation. A Sample Data Dictionary 2 Table 2.6 Relationships within the Relational Database E-R Diagram (ERD) 2 Rectangles are used to represent entities. Entity names are nouns and capitalized. Diamonds are used to represent the relationship(s) between the entities. The number 1 is used to represent the “1” side of the relationship. The letter M is used to represent the “many” sides of the relationship. The Relationship Between Painter and Painting 2 Figure 2.17 An Alternate Way to Present the Relationship Between Painter and Painting 2 Figure 2.18 A 1:M Relationship: The CH2_MUSEUM Database 2 Figure 2.19 The 1:M Relationship Between Course and Class 2 Figure 2.20 2 The M:N Relationship Between Student and Class 2 Figure 2.22 Sample Student Enrollment Data 2 Table 2.7 A Many-to-Many Relationship Between Student and Class 2 Figure 2.23 2 Changing the M:N Relationship to Two 1:M Relationships 2 Figure 2.25 The Expanded Entity Relationship Model 2 Figure 2.26 The Relational Schema for the Entity Relationship Diagram in Figure 2.26 2 Figure 2.27 Data Redundancy Revisited Proper use of foreign keys is crucial to exercising data redundancy control. 2 Database designers must reconcile three often contradictory requirements: design elegance, processing speed, and information requirements. (Chapter 4) Proper data warehousing design even requires carefully defined and controlled data redundancies, to function properly. (Chapter 13) 2 Figure 2.28 A Small Invoicing System Figure 2.29 The Relational Schema for the Invoicing System in Figure 2.28 2 The redundancy is crucial to the system’s success. Copying the product price from the PRODUCT table to the LINE table means that it is possible to maintain the historical accuracy of the transactions. Indexes An index is composed of an index key and a set of pointers. 2 Figure 2.30 Components of an Index