CSE 480: Database Systems Lecture 15: Relational Algebra Reference: Read Chapter 6.1-6.3.1 of the textbook ‹#› What is Relational Algebra? A formal way to express a query in relational model – A query consists of relational expressions describing the sequence of operators applied to obtain the query result Example : LName, Sex ( DNo=4 OR Salary>100000 (EMPLOYEE) ) – and are the operators ‹#› Why Study Relational Algebra? SELECT A.Name, B.Grade FROM A, B WHERE A.Id = B.Id Query processing in DBMS Name, Grade (Id,Name(A B)) ‹#› Relational Algebra Operator Output of a relational algebra expression is a relation Types of operators in a relational algebra expression – Unary: op(Relation) – Binary: Relation op Relation Relational algebra operators – select, project, union, set difference, Cartesian product ‹#› Select Operator () condition (R) Resulting relation (NOT THE SAME AS SELECT in SQL) – contains only tuples in R that satisfy the given condition – has the same attributes (columns) as the original relation Example: Status=‘Junior’(Student) Student Id 1123 1234 5556 9876 Name John Lee Mary Bart Address 123 Main 123 Main 7 Lake Dr 5 Pine St Status Junior Senior Freshman Junior Id 1123 9876 Name Address John 123 Main Bart 5 Pine St Status Junior Junior ‹#› Selection Condition select_condition (relation) – Selection condition acts as a filter to the rows in relation Selection condition is a Boolean expression – Simple selection condition: <attribute> operator <constant> <attribute> operator <attribute> where operator: <, , , >, =, – <condition> AND <condition> – <condition> OR <condition> – NOT <condition> ‹#› Selection Condition - Examples Student Id 1123 1234 5556 9876 Selectivity Name John Lee Mary Bart Address 123 Main 123 Main 7 Lake Dr 5 Pine St Status Junior Senior Freshman Junior Fraction of tuples selected by a selection condition Id>3000 OR Status=‘Freshman’ Id>3000 AND Id <3999 (Student) NOT(Status=‘Senior’) (Student) selectivity = 3/4 Status‘Senior’ (Student) selectivity = 3/4 (Student) selectivity = 2/4 selectivity = 0/4 ‹#› Selection Condition – Incorrect Queries Find professors who taught both CS315 and CS305 – CrsCode=‘CS315’ AND CrsCode=‘CS305’ (TEACHING) X Find courses taught by CS professors – PROFESSOR.Id=Teaching.ProfId AND PROFESSOR.DeptId=‘CS’ (TEACHING) X ‹#› Properties of SELECT operator Commutative: – <condition1>( < condition2> (R)) = <condition2> ( < condition1> (R)) Cascade of SELECT operations may be applied in any order: – <cond1>(<cond2> (<cond3> (R)) = <cond2> (<cond3> (<cond1> ( R))) Cascade of SELECT operations may be replaced by a single selection with a conjunction of all the conditions: – <cond1>(< cond2> (<cond3>(R)) = <cond1> AND < cond2> AND < cond3>(R))) ‹#› Project Operator Produces a relation containing a subset of columns of its input relation – attribute list(R) Example: Name,Status(Student) Student Id 1123 1234 5556 9876 Name Address John 123 Main Lee 123 Main Mary 7 Lake Dr Bart 5 Pine St Status Name Junior Freshman Senior Junior John Lee Mary Bart Status Junior Freshman Senior Junior ‹#› Project Operator Resulting relation has no duplicates; therefore it can have fewer tuples than the input relation Example: Address(Student) Student Id 1123 1234 5556 9876 Name John Lee Mary Bart Address 123 Main 123 Main 7 Lake Dr 5 Pine St Status Junior Freshman Senior Junior Address 123 Main 7 Lake Dr 5 Pine St ‹#› Properties of PROJECT Operator Number of tuples in the result of projection <list>(R) is always less or equal to the number of tuples in R – If list of projected attributes includes a superkey, the resulting relation has the same number of tuples as the input relation PROJECT operator is not commutative – <list1> ( <list2> (R) ) <list2> ( <list1> (R) ) ‹#› Relational Algebra Expressions Find the Ids and names of junior undergraduate students Student Id 1123 1234 5556 9876 Name Address John 123 Main Lee 123 Main Mary 7 Lake Dr Bart 5 Pine St Status Junior Freshman Senior Junior Id Name 1123 John 9876 Bart Result Id, Name ( Status=’Junior’ (Student) ) ‹#› Relational Algebra Expressions We may want to apply several relational algebra operations one after the other – We can write the operations as a single relational algebra expression by nesting the operations, or – We can apply one operation at a time and create intermediate result relations. – In the latter case, we must give names to the relations that hold the intermediate results. ‹#› Single versus Sequence of operations Query: Find the first name, last name, and salary of all employees who work in department number 5 We can write a single relational algebra expression: – FNAME, LNAME, SALARY( DNO=5(EMPLOYEE)) OR We can explicitly show the sequence of operations, giving a name to each intermediate relation: – DEP5_EMPS DNO=5(EMPLOYEE) – RESULT FNAME, LNAME, SALARY (DEP5_EMPS) ‹#› RENAME operator The RENAME operator is denoted by We may rename the attributes of a relation or the relation name or both – S (B1, B2, …, Bn )(R) Change the relation name from R to S, and Change column (attribute) names to B1, B2, …..Bn – S(R) : Change the relation name only to S – (B1, B2, …, Bn )(R) : Change the column (attribute) names only to B1, B2, …..Bn ‹#› Example Change relation name to Emp and attributes to First_name, Last_name, and Salary Emp(First_name, Last_name, Salary ) ( FNAME,LNAME,SALARY (EMPLOYEE)) or Emp(First_name, Last_name, Salary) FNAME,LNAME,SALARY (EMPLOYEE) ‹#› Set Operators A relation is a set of tuples; therefore set operations are applicable: – Intersection: – Union: – Set difference (Minus): Set operators are binary operators – Operator: Relation Relation Relation Result of set operation is a relation that has the same schema as the combining relations ‹#› Examples: Set Operations X Y A B A B x1 x2 x1 x2 x3 x4 x5 x6 XY XY X–Y A B A B A B x1 x2 x1 x2 x3 x4 x3 x4 x5 x6 ‹#› Example STUDENT INSTRUCTOR STUDENT – INSTRUCTOR STUDENT INSTRUCTOR – STUDENT INSTRUCTOR ‹#› Union Compatible Relations Set operations are limited to union compatible relations – Two relations are union compatible if: they have the same number of attributes, and the domains of corresponding attributes are type compatible (i.e. dom(Ai)=dom(Bi) for i=1, 2, ..., n). The resulting relation has the same attribute names as the first operand relation ‹#› Example Tables: Student (SSN, Name, Address, Status) Professor (Id, Name, Office, Phone) are not union compatible. But Name (Student) and Name (Professor) are union compatible ‹#› Exercise Relations: – Employee (SSN, Name, Address) – Professor (SSN, Office, Phone) – Student (SSN, Status) Find the SSN of student employees SSN (Employee) SSN (Student) Find the SSN of employees who are not professors SSN (Employee) – SSN (Professor) Find the SSN of employees who are neither professors nor students SSN (Employee) – ( SSN (Student) SSN (Professor)) ‹#› Cartesian Product A B x1 x2 x3 x4 C D y1 y2 y3 y4 R S A x1 x1 x3 x3 B C x2 y1 x2 y3 x4 y1 x4 y3 R S D y2 y4 y2 y4 R S is expensive to compute: – Number of columns = degree(R) + degree(S) – Number of rows = number of rows (R) × number of rows (S) ‹#› Example Broker (BrokerId, BrokerName) BrokerID Client (ClientId, ClientName) 1 Merrill Lynch 2 Morgan Stanley 3 Salomon Smith Barney List all the possible Broker-Client pairs – Broker Client BrokerId BrokerName ClientId ClientName 1 Merrill Lynch A11 Bill Gates 1 Merrill Lynch B11 Steve Jobs 2 Morgan Stanley A11 Bill Gates 2 Morgan Stanley B11 Steve Jobs 3 Salomon Smith Barney A11 Bill Gates 3 Salomon Smith Barney B11 Steve Jobs BrokerName ClientId ClientName A11 Bill Gates B11 Steve Jobs ‹#› More Complex Examples Find the names of employees who are not supervisors Nonsup SSN Employee SuperSSN Employee Answer Fname , MInit , Lname Nonsup. SSN Employee. SSN Nonsup Employee ‹#› Example Find the names of employees who earn more than their supervisors Emp (SSN1, Sal) SSN , SalaryEmployee Answer Fname, Lname SuperSSN SSN1 ANDSalarySal Employee Emp ‹#› Summary (Relational Algebra Operators) Unary operators – SELECT condition(R) – PROJECT Attribute-List(R) – RENAME S(A1,A2,…Ak)(R) Set operators – RS – RS – R – S or S – R Cartesian product – RS ‹#›