Chapter 3 Section 3.4 3 Relational Database Operators (Relational Algebra) Database Systems: Design, Implementation, and Management 7th Edition Peter Rob & Carlos Coronel What is Relational Algebra? 3 Part of Relational DB Theory Operations that any RDBMS should provide for data manipulation NOT directly included in products; capabilities generally provided via QBE or SQL What is included in RA? 3 Set Operations Operations specific to RDBs Recent Advanced Add-Ons All RA operations work on one or more relations, and produce a relation as a result UNION 3 Produces a resulting relation that contains a tuple for every tuple in either or both of two input relations (duplicates only occur once) The Relations being combined must be unioncompatible (type-compatible) e.g. CurrentEnrollments U HistoricalEnrollments; MailListFromSierraClub U MailListFromAudabonSoc; INTERSECTION 3 Produces a resulting relation that contains a tuple for every tuple in BOTH of two input relations The Relations being combined must be union-compatible (type-compatible) MailListFromSierraClub INTERSECT MailListFromNewBabyMag; SET DIFFERENCE (MINUS) 3 Produces a resulting relation that contains a tuple for every tuple in FIRST of two input relations AND NOT IN the second. The Relations being combined must be union-compatible (type-compatible) MailListFromMarketingCompany - CurrentCustomerList; CARTESIAN PRODUCT (TIMES) 3 Produces a resulting relation that contains all attributes in either input relation and a tuple for every possible combination of tuples in two input relations. The Relations being combined must be productcompatible BY ITSELF, not usually useful in the real world Relational Database Operators PRODUCT produces a list of all possible pairs of rows from two tables. 3 (old) Figure 2.8 PRODUCT RESTRICT/SELECT 3 Produces a resulting relation, containing only the tuples that meet some condition (hence a “horizontal” subset of the original relation) e.g. employees in department #4, students majoring in CS, students with a GPA < 2.0 PROJECTion 3 Produces a resulting relation, containing only the attributes that are requested (hence a “vertical” subset of the original relation) e.g. last name, first name and salary of employees; last name, major, year of students; dept, class of sections Relational Database Operators 3 JOIN allows us to combine information from two or more tables. JOIN is the real power behind the relational database, allowing the use of independent tables linked by common attributes. JOINS 3 Produces a resulting relation that contains all attributes in either input relation and a tuple for “every possible combination of tuples in two input relations that meet some condition”. Relational Database Operators Natural 3 JOIN links tables by selecting only the rows with common values in their common attribute(s). It is the result of a three-stage process: A PRODUCT of the tables is created. (Figure 3.12) A SELECT is performed on the output of the first step to yield only the rows for which the common attribute values match. (Figure 3.13) A PROJECT is performed to yield a single copy of each attribute, thereby eliminating the duplicate column. (Figure 3.14) Natural Join, Step 1: PRODUCT 3 Natural Join, Step 2: SELECT 3 Natural Join, Step 3: PROJECT 3 Natural Join (continued) Final outcome yields table that 3 Does not include unmatched pairs Provides only copies of matches If no match is made between the table rows, the new table does not include the unmatched row Relational Database Operators 3 EquiJOIN links tables based on an equality condition that compares specified columns of each table. The outcome of the EquiJOIN does not eliminate duplicate columns and the condition or criteria to join the tables must be explicitly defined. Theta JOIN is an equiJOIN that compares specified columns of each table using a comparison operator other than the equality comparison operator. In an Outer JOIN, the unmatched pairs would be retained and the values for the unmatched other tables would be left blank or null. Outer Join Matched pairs are retained and any unmatched values in other table are left null In outer join for tables CUSTOMER and AGENT, three scenarios are possible: 3 Left outer join Right outer join Yields all rows in CUSTOMER table, including those that do not have a matching value in the AGENT table Yields all rows in AGENT table, including those that do not have matching values in the CUSTOMER table FULL OUTER JOIN - keeps all tuples in first or second relation even if no matching tuples Left Outer Join 3 Right Outer Join 3 DIVISION 3 Useful for finding all X who are doing something with all Y e.g. find all students who are taking all three of these sections: 66416, 66417,66419 book e.g. find all locations that are associated with both codes A & B Relational Database Operators DIVIDE typically involves the use of one single-column table and one two-column table. 3 A Minimally Complete Set of RA Operations 3 RESTRICT, PROJECT, UNION, SET DIFFERENCE, CARTESIAN PRODUCT Others can be derived R INTERSECT S is equivalent to (R U S) - ((R -S) U (S - R)) Theta JOIN is equivalent to cartesian product followed by restrict NATURAL JOIN is equivalent to Cartesian product preceded by rename and followed by project and restrict Enhancements 3 Aggregate Functions - SUM, AVERAGE, MAXIMUM, MINIMUM Aggregates within Group e.g. GROUPING BY: Dept#; COUNT SSN, AVERAGE SALARY (Employee) - for each department give the count of # employees and the average salary A Final Word 3 While RA is not seen commercially, it is a foundation on what is available commercially. It is a commonly understood basis for comparison and for communication. End Relational Algebra 3