RELATIONAL ALGEBRA Prof. Sin-Min LEE Department of Computer Science Terminology: table (relation) row (tuple) formally, a relation is a set of ordered n- tuples. Domain : set of data values from which an attribute value can be drawn. Eg. set of department names set of legal salaries set of possible employee birth dates If D1, D2, D3……. Dn are the domains of a relation R. Then, R D1 x D2x D3 x ………..x Dn. It is an abstract language. We use it to express the set of operations that any relational query language must perform. Two types of operations: 1.set-theoretic operations: tables are essentially sets of rows 2.native relational operations: focus on the structure of the rows Query languages are specialized languages for asking questions,or queries,that involve the data in database. Database Scheme A relational database scheme, or schema, corresponds to a set of table definitions. Eg: product(p_id, name, category, description) supply(p_id, s_id, qnty_per_month) supplier(s_id, name, address, ph#) * remember the difference between a DB instance and a DB scheme. Keys The super key, candidate key and primary key notations presented for ER model apply for relational model also. Query languages procedural vs. non-procedural commercial languages have some of both we will study: relational algebra (which is procedural, i.e. tells you how to process a query) relational calculus (which is non-procedural i.e. tells what you want) Relational Algebra Fundamental operators select project cartesian product union set difference s p - Other operators natural join set intersection division JOIN (butterfly symbol) A Simple DB account had transaction account ac# owner ss# balance 1 bob 123 1000 2 sue 456 2000 3 jane 789 3000 transaction t# ac# type amount outcome 1 1 W 1500 bounced 2 2 D 1000 ok 3 1 W 100 ok 4 3 D 500 ok 5 2 W 200 ok date 5/1/98 5/2/98 5/4/98 5/7/98 5/9/98 Select s balance>=1500 account eg: result : ac# owner ss# balance 2 sue 456 2000 3 jane 789 3000 Project eg: π result: owner, ss# account owner bob sue jane ss# 123 456 789 Cartesian product eg: account transaction this will have 15 rows like the ones shown below: ac# owner ss# balance t# type amount outcome date 1 bob 123 1000 1 W 1500 bounced 5/1/98 2 sue 456 2000 2 D 1000 ok 5/2/98 …………… Composing operations eg: “show all transactions done by account owner Bob”. σ account.ano= transaction.ano((s owner=“Bob” account) transaction) Natural Join - combines σ, π, - very commonly used Natural Join forms the cross product of its two arguments, does a selection to enforce equality of columns with the same name and removes duplicate columns. Eg: “show all transactions done by account owner Bob” σ owner=“Bob” (account JOIN transaction) Rename operation What if you need to access the same relation twice in a query? eg. person(ss#, name, mother_ss#, father_ss#) “Find the name of Bob’s mother” needs the “person” table to be accessed twice. The operation ρ x (r) evaluates to a second logical copy of relation r renamed to x. Rename operation (contd) eg: π mother.name (ρ ( mother (person)) JOIN mother.ss# = person.mother_ss# (s name=“Bob” (person))) Additional Operations Additional Operations are those that can be expressed in terms of other operations . r Set Intersection eg.: r a 1 2 3 b a b d rs= s a 1 2 3 r-(r-s) b a c d s rs= a b 1 a 3 d Additional Operations(cntd.) Division useful for “for all” queries Definition: Let r(R) and s(S), where R & S are sets of attributes, be relations, where S is a subset of R. The relation r s has scheme R-S. The tuples in r s consist of the R-S part of the tuples of r such that some tuple tr in r with the those R-S attribute values matches every tuple in s. Can also be defined in terms of relational algebra. Additional Operations(cntd.) Assignment operation Sometimes it is convenient to write a relational algebra expression as a sequence of steps rather than one large expression. To do this, you can use assignment: relname expression eg.: temp π pno (part) bigsuppliers supply temp