Relational Algebra

Discussion #23
Topics
• Algebras
• Relational Algebra
– use of standard notation
– set operators ∪, ∩, −
– renaming ρ
– selection σ
– projection π
– cross product ×
– join |×|
• Queries (from English)
• Query optimization
• SQL
Relational Algebra
• What is an algebra?
– a pair: (set of values, set of operations)
– ≈ ADT ≈ type ≈ Class ≈ Object
e.g. stack: (set of all stacks, {pop, push, top, …})
integer: (set of all integers, {+, -, *, ÷})
• What is relational algebra?
– (set of relations, set of relational operators)
– {∪, ∩, −, ρ, σ, π, ×, |×|}
Relational Algebra is Closed
• Closed: all operations produce values in the value set
– (reals, {+, *, −}) ⇒ closed
– (reals, {+, *, −, ÷}) ⇒ not closed (divide by 0)
– (reals, {+, *, >}) ⇒ not closed (T/F not in value set)
– (computer reals, {+, *, −}) ⇒ not closed (overflow, roundoff)
– (relations, relational operators) ⇒ closed
• Implication: we can always nest relational operators;
can’t for algebras that are not closed.
– e.g. after overflow, can do nothing
– e.g. can’t always nest: (2 < 3) + 5 = ?
Set Operations: ∪, ∩, and −
• Relations are sets; thus set operations should work.
• Examples:
R = A B
    1 2
    2 2
    2 3
S = A B
    2 2
    2 3
    4 2
    5 5
R ∪ S = A B
        1 2
        2 2
        2 3
        4 2
        5 5
R ∩ S = A B
        2 2
        2 3
R − S = A B
        1 2
S − R = A B
        4 2
        5 5
Set Operations (continued …)
• Definition: schema(R) = {A, B} = AB, i.e. the
set of attributes
• We sometimes write R(AB) to mean the relation
R with schema AB.
• Definition: union compatible
– schema(R) = schema(S)
– required precondition for ∪, ∩, −
• Definitions:
– R ∪ S = { t | t ∈ R ∨ t ∈ S}
– R ∩ S = { t | t ∈ R ∧ t ∈ S}
– R − S = { t | t ∈ R ∧ t ∉ S}
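The three set operators map directly onto Python's built-in set type. The sketch below is only an illustration, not part of the original slides: it assumes a tuple is stored as a frozenset of (attribute, value) pairs, and the helpers `row`, `schema`, and `union_compatible` are hypothetical names introduced here. The data is the R and S from the set-operations example.

```python
def row(**attrs):
    """A tuple in the slides' sense: a set of (attribute, value) pairs."""
    return frozenset(attrs.items())

def schema(rel):
    """Attribute set of a relation (hypothetical helper; needs a non-empty relation)."""
    return {name for name, _ in next(iter(rel))}

def union_compatible(r, s):
    """Precondition for union, intersection, and difference."""
    return schema(r) == schema(s)

R = {row(A=1, B=2), row(A=2, B=2), row(A=2, B=3)}
S = {row(A=2, B=2), row(A=2, B=3), row(A=4, B=2), row(A=5, B=5)}

assert union_compatible(R, S)
union = R | S          # t in R or t in S
intersection = R & S   # t in R and t in S
r_minus_s = R - S      # t in R and t not in S
s_minus_r = S - R
```

Because relations are sets, duplicates collapse automatically: the union has five tuples, not seven.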
Tuple Restriction: [X]
• Restriction is a tuple operator (not a relational
operator).
• t[X] restricts tuple t to the attributes in X.
A B C
t=1 2 3
t[A] = (1)
t[AC] = (1,3)
t = (1,2,3)
t[A] = (1,2,3)[A]
= {(A,1), (B, 2), (C,3)}[A]
= {(A,1)}
= (1)
Renaming: ρ
• ρ_{A←B} R renames attribute A to be B.
– A must be in schema(R)
– B must not be in schema(R)
• Example: let
R = A B
    1 2
    2 2
    2 3
Q = A C
    2 2
    3 2
R ∪ Q = ?  Not union compatible
• But with ρ:
ρ_{C←B} Q = A B
            2 2
            3 2
R ∪ ρ_{C←B} Q = A B
                1 2
                2 2
                2 3
                3 2
Renaming (continued…)
• Q = ρ_{A←B} R renames attribute A to B; the result is Q.
• Precondition:
– A ∈ schema(R)
– B ∉ schema(R)
• Postcondition:
– schema(Q) = (schema(R) − {A}) ∪ {B}
– Q = {t' | ∃t (t ∈ R ∧ t' = (t − {(A, t[A])}) ∪ {(B, t[A])})}
R = {{(A,1), (C,2)},
     {(A,2), (C,2)}}
Q = ρ_{A←B} R = {{(B,1), (C,2)},
                 {(B,2), (C,2)}}
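The pre- and postconditions for renaming can be sketched in a few lines of Python. This is an illustrative sketch only: tuples are assumed to be frozensets of (attribute, value) pairs, and `row` and `rename` are hypothetical helper names, not anything defined in the slides.

```python
def row(**attrs):
    """A tuple as a set of (attribute, value) pairs."""
    return frozenset(attrs.items())

def rename(rel, old, new):
    """Rename attribute `old` to `new` in every tuple of `rel`."""
    out = set()
    for t in rel:
        d = dict(t)
        # Preconditions: old is in schema(R), new is not in schema(R).
        if old not in d or new in d:
            raise ValueError(f"cannot rename {old!r} to {new!r}")
        # Postcondition: drop (old, t[old]) and add (new, t[old]).
        d[new] = d.pop(old)
        out.add(frozenset(d.items()))
    return out

R = {row(A=1, C=2), row(A=2, C=2)}
Q = rename(R, "A", "B")
```

Renaming A to C on this R would violate the second precondition, so the sketch raises an error rather than silently merging two attributes.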
Selection: σ
• The selection operation selects the tuples that
satisfy a condition.
R = A B
    1 2
    2 2
    2 3
σ_{A=1} R = A B
            1 2
σ_{B=2} R = A B
            1 2
            2 2
σ_{A=2 ∧ B≥2} R = A B
                  2 2
                  2 3
σ_{A=3} R = A B
Note: empty, but still retains the schema
σ_P R = { t | t ∈ R ∧ P(t) }
Meaning: apply predicate P to tuple t by
substituting into P appropriate t values.
• Precondition: each attribute mentioned in P must
be in schema(R).
• Postcondition: σ_P R = { t | t ∈ R ∧ P(t) }
schema(σ_P R) = schema(R)
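Selection is a filter over the tuples of a relation. The following is a minimal sketch, assuming tuples are frozensets of (attribute, value) pairs and predicates are ordinary Python functions over a dict view of the tuple; `row` and `select` are hypothetical names, and the conjunctive predicate is simply one example of a compound condition.

```python
def row(**attrs):
    """A tuple as a set of (attribute, value) pairs."""
    return frozenset(attrs.items())

def select(rel, pred):
    """Keep exactly the tuples t of rel for which pred(t) holds."""
    return {t for t in rel if pred(dict(t))}

R = {row(A=1, B=2), row(A=2, B=2), row(A=2, B=3)}

a_is_1 = select(R, lambda t: t["A"] == 1)
b_is_2 = select(R, lambda t: t["B"] == 2)
a2_and_b_ge_2 = select(R, lambda t: t["A"] == 2 and t["B"] >= 2)
a_is_3 = select(R, lambda t: t["A"] == 3)   # no tuple qualifies
```

One limitation of this representation: an empty result is just an empty set, so the schema that the slides say is retained would have to be tracked separately.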
Projection: π
The projection operation restricts tuples in a
relation to those designated in the operation.
R = A B
    1 2
    2 2
    2 3
π_{A} R = A
          1
          2
π_{B} R = B
          2
          3
π_{AB} R = R = π_{A,B} R = π_{{A,B}} R
Q = A B C
    1 1 1
    2 1 1
    3 4 5
π_{BC} Q = B C
           1 1
           4 5
Precondition: X ⊆ schema(R)
Postcondition: π_X R = { t' | ∃t (t ∈ R ∧ t' = t[X]) }
schema(π_X R) = X
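Projection applies tuple restriction t[X] to every tuple and lets set semantics remove the duplicates that result. A minimal sketch under the same assumed representation (frozensets of (attribute, value) pairs; `row` and `project` are hypothetical helpers), using the three-attribute relation Q from the example:

```python
def row(**attrs):
    """A tuple as a set of (attribute, value) pairs."""
    return frozenset(attrs.items())

def project(rel, attrs):
    """Restrict every tuple of rel to the attributes in attrs."""
    attrs = set(attrs)
    out = set()
    for t in rel:
        # Precondition: attrs must be a subset of schema(R).
        if not attrs <= {a for a, _ in t}:
            raise ValueError("projection attributes must be in the schema")
        out.add(frozenset((a, v) for a, v in t if a in attrs))
    return out

Q = {row(A=1, B=1, C=1), row(A=2, B=1, C=1), row(A=3, B=4, C=5)}
bc = project(Q, {"B", "C"})
```

Two of the three tuples of Q agree on B and C, so the projection has only two tuples.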
Cross Product: ×
Standard cartesian product adapted for
relational algebra
R = A B
    1 2
    2 2
S = C D
    1 1
    2 2
    3 3
R × S = A B C D
        1 2 1 1
        1 2 2 2
        1 2 3 3
        2 2 1 1
        2 2 2 2
        2 2 3 3
Cross Product (continued…)
Precondition: schema(R) ∩ schema(S) = ∅
Postcondition: R × S = { t | ∃t' ∃t'' (t' ∈ R ∧ t'' ∈ S ∧ t = t' ∪ t'')}
schema(R × S) = schema(R) ∪ schema(S)
R = A B
    1 2 = t'
    2 2
t' = { (A,1), (B,2) }
S = C D
    1 1
    2 2
    3 3 = t''
t'' = { (C,3), (D,3) }
t' ∪ t'' = { (A,1), (B,2), (C,3), (D,3) }
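Since each tuple is a set of (attribute, value) pairs, the cross product is literally the union of one tuple from each relation, for every pair of tuples. The sketch below assumes the frozenset representation; `row` and `cross` are hypothetical helper names.

```python
def row(**attrs):
    """A tuple as a set of (attribute, value) pairs."""
    return frozenset(attrs.items())

def cross(r, s):
    """All unions t1 | t2 with t1 from r and t2 from s."""
    out = set()
    for t1 in r:
        for t2 in s:
            # Precondition: the two schemas share no attribute.
            if {a for a, _ in t1} & {a for a, _ in t2}:
                raise ValueError("schemas must be disjoint; rename first")
            out.add(t1 | t2)
    return out

R = {row(A=1, B=2), row(A=2, B=2)}
S = {row(C=1, D=1), row(C=2, D=2), row(C=3, D=3)}
product = cross(R, S)
```

The result has |R| * |S| = 6 tuples. If the schemas overlapped, the union of two tuples could hold two pairs for the same attribute, which is why the sketch refuses that case.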
Cross Product (continued…)
What if R and S have the same attribute, e.g. A?
R = A B
    1 2 = t' = { (A,1), (B,2) }
    2 2
S = C A
    1 1 = t'' = { (C,1), (A,1) }
    2 2
    3 3 = t''' = { (C,3), (A,3) }
t' ∪ t'' = { (A,1), (B,2), (C,1), (A,1) }
Can’t do cross product
Solution: Rename
ρ_{A←A'} S
R × ρ_{A←A'} S = A B C A'
                 1 2 1 1
                 1 2 2 2
                 1 2 3 3
                 2 2 1 1
                 2 2 2 2
                 2 2 3 3
Natural Join: |×|
R = A B
    1 2
    2 2
S = B C
    1 2
    2 1
    3 2
R |×| S = A B C
          1 2 1
          2 2 1
R |×| S = π_{ABC} σ_{B=B'} (R × ρ_{B←B'} S)
Projection: π_{ABC}   Selection: σ_{B=B'}   Cross Product: ×   Renaming: ρ_{B←B'}
R × ρ_{B←B'} S = A B B' C
                 1 2 1  2
                 1 2 2  1   (selected: B = B')
                 1 2 3  2
                 2 2 1  2
                 2 2 2  1   (selected: B = B')
                 2 2 3  2
Join (continued …)
• In general, we can equate 0, 1, 2, or more
attributes using |×|.
• A join is defined as:
schema(R |×| S) = schema(R) ∪ schema(S)
R |×| S = {t | t[schema(R)] ∈ R
∧ t[schema(S)] ∈ S}
• There are no preconditions ⇒ join always
works.
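One way to compute a natural join consistent with this definition is to pair up tuples and keep the pairs that agree on every shared attribute. The sketch below is an assumption-laden illustration (frozenset tuples; `row` and `natural_join` are hypothetical names), not a statement of how a database system implements joins.

```python
def row(**attrs):
    """A tuple as a set of (attribute, value) pairs."""
    return frozenset(attrs.items())

def natural_join(r, s):
    """Merge every pair of tuples that agree on all common attributes."""
    out = set()
    for t1 in r:
        d1 = dict(t1)
        for t2 in s:
            d2 = dict(t2)
            common = d1.keys() & d2.keys()
            if all(d1[a] == d2[a] for a in common):
                out.add(frozenset({**d1, **d2}.items()))
    return out

R = {row(A=1, B=2), row(A=2, B=2)}
S = {row(B=1, C=2), row(B=2, C=1), row(B=3, C=2)}
joined = natural_join(R, S)

# With no attribute in common the agreement test is vacuously true,
# so the join degenerates to the full cross product.
T = {row(D=7), row(D=8)}
no_common = natural_join(R, T)
```

There is no precondition to check: whatever the two schemas are, the function returns a (possibly empty) relation.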
Join (continued…)
0 attributes in common (full cross product)
R = A B
    1 1
    2 3
    4 1
S = C D
    1 1
    1 5
R |×| S = A B C D
          1 1 1 1
          1 1 1 5
          2 3 1 1
          2 3 1 5
          4 1 1 1
          4 1 1 5
1 attribute in common
R = A B
    1 2
    2 2
    2 3
S = B C
    1 1
    2 2
    3 3
R |×| S = A B C
          1 2 2
          2 2 2
          2 3 3
2 attributes in common
R = A B C
    1 2 3
    2 2 4
    2 3 5
S = A B D
    1 1 1
    2 2 2
    2 2 1
R |×| S = A B C D
          2 2 4 2
          2 2 4 1
Join (continued…)
• We can use renaming to control the |×|
R = A B
    1 2
    2 2
S = B C
    1 2
    2 1
    3 2
S' = ρ_{C←A} S = B A  =  A B
                 1 2     2 1
                 2 1     1 2
                 3 2     2 3
R |×| ρ_{C←A} S = R |×| S' = A B
                             1 2
• BTW, observe equivalence with intersection
Relational Algebra Expressions
• Relational operators are closed. Thus we can nest
expressions:
R = A B
    1 2
    3 4
S = B C D
    2 5 1
    2 7 2
    3 2 3
    4 5 4
π_{D} σ_{C=5} (R |×| S) = π_{D} of
    A B C D
    1 2 5 1
    3 4 5 4
= D
  1
  4
where R |×| S = A B C D
                1 2 5 1
                1 2 7 2
                3 4 5 4
• Unary operators have precedence over binary
operators; binary operators are left associative.
• We can now do something very useful: ask and
answer with relational algebra (almost) any query
we can dream up.
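Closure is what makes nesting work in code as well: each operator returns a relation, so its result can be passed straight to the next operator. The sketch below evaluates a project of a select of a join over the example R(AB) and S(BCD); the frozenset representation and the helper names `row`, `select`, `project`, and `natural_join` are assumptions made for illustration.

```python
def row(**attrs):
    """A tuple as a set of (attribute, value) pairs."""
    return frozenset(attrs.items())

def select(rel, pred):
    return {t for t in rel if pred(dict(t))}

def project(rel, attrs):
    return {frozenset((a, v) for a, v in t if a in attrs) for t in rel}

def natural_join(r, s):
    out = set()
    for t1 in r:
        d1 = dict(t1)
        for t2 in s:
            d2 = dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                out.add(frozenset({**d1, **d2}.items()))
    return out

R = {row(A=1, B=2), row(A=3, B=4)}
S = {row(B=2, C=5, D=1), row(B=2, C=7, D=2),
     row(B=3, C=2, D=3), row(B=4, C=5, D=4)}

# Innermost first: join, then select C = 5, then project on D.
result = project(select(natural_join(R, S), lambda t: t["C"] == 5), {"D"})
```

Reading the Python call inside-out mirrors reading the algebra expression right to left.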
Relational Algebra Queries
• List the prerequisites for EE200.
π_{Prerequisite} σ_{Course='EE200'} cp = Prerequisite
                                         EE005
                                         CS100
• When does CS101 meet?
π_{Day,Hour} σ_{Course='CS101'} cdh = Day Hour
                                      M   9AM
                                      W   9AM
                                      F   9AM
• When and where does EE200 meet?
π_{Day,Hour,Room} σ_{Course='EE200'} (cdh |×| cr) = Day Hour Room
                                                    Tu  10AM 25 Ohm Hall
                                                    W   1PM  25 Ohm Hall
                                                    Th  10AM 25 Ohm Hall
Our answers are in (cdh |×| cr).
We select Course to be EE200.
Then, project on Day, Hour, Room.
Queries (continued…)
• Where can I find Snoopy at 9 am on Monday?
snap: StudentID  Name ('Snoopy')  Address  Phone
csg:  Course  StudentID  Grade
cr:   Course  Room*
cdh:  Course  Day ('M')  Hour ('9AM')
π_{Room} σ_{Name='Snoopy' ∧ Day='M' ∧ Hour='9AM'} (snap |×| csg |×| cr |×| cdh)
= Room
  Turing Aud.
• Can we rewrite the query so that it runs more efficiently?
• What rules should we use?
– Associativity and commutativity of join
– Distributive laws for select and project
• What strategy should we use?
– Eliminate unnecessary operations
– Make joins as small as possible before execution
Query Optimization
• “Intuitively” we can write
π_{Room} σ_{Name='Snoopy' ∧ Day='M' ∧ Hour='9AM'} (snap |×| csg |×| cr |×| cdh)
as
π_{Room} (σ_{Name='Snoopy'} snap |×| csg |×| cr |×| σ_{Day='M' ∧ Hour='9AM'} cdh)
• Why does this execute faster?
• What laws hold that will let us do this?
R |×| S = S |×| R
σ_{P1 ∧ P2} E = σ_{P1} σ_{P2} E
σ_{P} (R |×| S) = R |×| σ_{P} S (if all the attributes of P are in S)
• How do we know they hold?
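Before proving the laws, it can help to check them on a small instance. The sketch below compares selecting after a join with selecting before it, and also checks the cascade of selections. It is a spot check on one data set, not a proof; the frozenset representation and the helpers `row`, `select`, and `natural_join` are assumptions introduced for illustration.

```python
def row(**attrs):
    """A tuple as a set of (attribute, value) pairs."""
    return frozenset(attrs.items())

def select(rel, pred):
    return {t for t in rel if pred(dict(t))}

def natural_join(r, s):
    out = set()
    for t1 in r:
        d1 = dict(t1)
        for t2 in s:
            d2 = dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                out.add(frozenset({**d1, **d2}.items()))
    return out

R = {row(A=1, B=2), row(A=2, B=2), row(A=2, B=3)}
S = {row(B=1, C=1), row(B=2, C=2), row(B=3, C=3)}
p = lambda t: t["C"] == 2          # mentions only an attribute of S

# Pushing the selection into S gives the same answer as selecting last.
select_late = select(natural_join(R, S), p)
select_early = natural_join(R, select(S, p))

# Tuple pairs the nested-loop join has to examine in each plan.
pairs_late = len(R) * len(S)
pairs_early = len(R) * len(select(S, p))

# Cascade law: one conjunctive selection equals two nested selections.
E = natural_join(R, S)
p1 = lambda t: t["A"] == 2
both = select(E, lambda t: p1(t) and p(t))
nested = select(select(E, p), p1)
```

On this data the early selection shrinks S from three tuples to one, so the join examines 3 pairs instead of 9. That is the intuition behind making joins as small as possible before execution.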
Proofs for Laws
• To prove σ_{P1 ∧ P2} E = σ_{P1} σ_{P2} E, we need to prove that
two sets are equal. We prove A = B by showing A ⊆ B ∧
B ⊆ A. We show that A ⊆ B by showing that x ∈ A ⇒ x ∈ B.
• Thus, we can do two proofs to prove σ_{P1 ∧ P2} E =
σ_{P1} σ_{P2} E as follows:
1. t ∈ σ_{P1 ∧ P2} E          premise
2. t ∈ E ∧ (P1 ∧ P2)(t)       def.: σ_P R = {t | t ∈ R ∧ P(t)}
3. t ∈ E ∧ P1(t) ∧ P2(t)      identical substitutions & operations
4. t ∈ E ∧ P2(t) ∧ P1(t)      commutative
5. t ∈ σ_{P2} E ∧ P1(t)       def. of σ
6. t ∈ σ_{P1} σ_{P2} E        def. of σ

1. t ∈ σ_{P1} σ_{P2} E        premise
2. …                          just go backwards from 6 to 1 in the proof above
Alternate Proof
(Derive the right-hand side from the left-hand side.)
Thus, we can prove σ_{P1 ∧ P2} E = σ_{P1} σ_{P2} E as follows:
σ_{P1 ∧ P2} E
= {t | t ∈ E ∧ (P1 ∧ P2)(t)}      def.: σ_P R = {t | t ∈ R ∧ P(t)}
= {t | t ∈ E ∧ P1(t) ∧ P2(t)}     identical substitutions & operations
= {t | t ∈ E ∧ P2(t) ∧ P1(t)}     commutative
= {t | t ∈ σ_{P2} E ∧ P1(t)}      def. of σ
= {t | t ∈ σ_{P1} σ_{P2} E}       def. of σ
= σ_{P1} σ_{P2} E                 def. of a relation
Proofs for Laws (continued …)
• To prove σ_P (R |×| S) = R |×| σ_P S, where all attributes of P are
in S, we again need to prove that two sets are equal.
• As before, we can convert the lhs to the rhs.
σ_P (R |×| S)
= {t | t ∈ σ_P (R |×| S)}
      def. of a relation
= {t | t ∈ R |×| S ∧ P(t)}
      def.: σ_P R = {t | t ∈ R ∧ P(t)}
= {t | t[schema(R)] ∈ R ∧ t[schema(S)] ∈ S ∧ P(t)}
      def.: R |×| S = {t | t[schema(R)] ∈ R ∧ t[schema(S)] ∈ S}
= {t | t[schema(R)] ∈ R ∧ t[schema(S)] ∈ S ∧ P(t[schema(S)])}
      all attributes of P are in S
= {t | t[schema(R)] ∈ R ∧ t[schema(S)] ∈ σ_P S}
      def. of σ
= {t | t ∈ R |×| σ_P S}
      def. of |×|
= R |×| σ_P S
      def. of a relation
SQL
Correspondence with Relational Algebra
Assume we have relations R(AB) and S(BC).

π_{A} σ_{B=1} R
select A
from R
where B = 1

π_{B} R − π_{B} S
select B from R
except
select B from S

π_{A, R.B, C} σ_{R.B=S.B} (R × S) = R |×| S
select A, R.B, C
from R, S
where R.B = S.B
SQL
Correspondence with Relational Algebra
Assume we have relations R(AB) and S(BC).

π_{A} σ_{B=1} R
select A
from R
where B = 1

π_{B} R − π_{B} S
select R.B from R
where R.B not in
(select S.B from S)

R |×| S
select *
from R natural join S
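The correspondences above can be tried out with Python's standard sqlite3 module and an in-memory database. This is only a sketch: the rows inserted into R(A, B) and S(B, C) are made-up sample data, not taken from the slides, and the results are sorted in Python because SQL result order is otherwise unspecified.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table R (A integer, B integer);
    create table S (B integer, C integer);
    insert into R values (1, 2), (2, 2), (2, 3), (3, 1);  -- sample data (assumed)
    insert into S values (1, 1), (2, 2);
""")

def run(sql):
    """Run a query and return its rows as a sorted list of tuples."""
    return sorted(conn.execute(sql).fetchall())

# Projection of a selection.
proj_sel = run("select A from R where B = 1")

# Set difference, written two ways.
diff_except = run("select B from R except select B from S")
diff_not_in = run("select R.B from R where R.B not in (select S.B from S)")

# Natural join, written as select/project over a cross product
# and with the natural join keyword.
join_explicit = run("select A, R.B, C from R, S where R.B = S.B")
join_natural = run("select A, B, C from R natural join S")
```

The two difference queries agree here because each unmatched B value occurs once in R; in general `except` removes duplicate rows while the `not in` form keeps them.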
SQL Queries
• List the prerequisites for EE200.
select Prerequisite
from cp
where Course='EE200'
Result:
Prerequisite
EE005
CS100
• When does CS101 meet?
select Day, Hour
from cdh
where Course= 'CS101'
Result:
Day Hour
M   9AM
W   9AM
F   9AM
• When and where does EE200 meet?
select cdh.Course, Day, Hour, Room
from cdh, cr
where cdh.Course= 'EE200'
and cdh.Course=cr.Course
Result:
Course Day Hour Room
EE200  Tu  10AM 25 Ohm Hall
EE200  W   1PM  25 Ohm Hall
EE200  Th  10AM 25 Ohm Hall
• When and where does EE200 meet? (using natural join)
select Course, Day, Hour, Room
from cdh natural join cr
where Course= 'EE200'
Result:
Course Day Hour Room
EE200  Tu  10AM 25 Ohm Hall
EE200  W   1PM  25 Ohm Hall
EE200  Th  10AM 25 Ohm Hall
SQL Queries
• List all prerequisite courses.
select Prerequisite
from cp
Prerequisite
CS100
EE005
CS100
CS101
CS120
CS101
CS121
CS205
select distinct Prerequisite
from cp
Prerequisite
CS100
CS101
CS120
CS121
CS205
EE005
SQL Queries
• Where can I find Snoopy at 9 am on Monday?
select Room
from snap, csg, cr, cdh
where Name='Snoopy' and Day='M'
and Hour='9AM' and snap.StudentID=csg.StudentID
and csg.Course=cr.Course and cr.Course=cdh.Course
Result:
Room
Turing Aud.
• List all prereqs of CS750 (including prereqs of prereqs.)
• Not possible with the basic select-from-where SQL shown here (unless nesting depth is known); SQL:1999 and later add recursive queries (WITH RECURSIVE) for this
• Is possible with Datalog
Rules: prereqOf(x, y) :- cp(y, x).
prereqOf(x, y) :- prereqOf(x, z), cp(y, z).
Query: prereqOf(x, 'CS750')?
• To gain more power and flexibility, we typically embed SQL in
a high-level language.
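The two Datalog rules for prereqOf can be evaluated by repeating the recursive rule until no new facts appear, which is the kind of loop a host language adds around a query. The sketch below is a naive fixpoint computation written for illustration; the function name `prereq_closure` and the course pairs other than the query target CS750 are hypothetical.

```python
def prereq_closure(cp):
    """cp is a set of (course, prerequisite) pairs, like the cp relation.

    Returns all pairs (x, y) such that x is a direct or indirect
    prerequisite of y.
    """
    # Rule 1: prereqOf(x, y) :- cp(y, x).
    prereq_of = {(x, y) for (y, x) in cp}
    while True:
        # Rule 2: prereqOf(x, y) :- prereqOf(x, z), cp(y, z).
        new = {(x, y)
               for (x, z) in prereq_of
               for (y, z2) in cp
               if z == z2}
        if new <= prereq_of:      # nothing new: fixpoint reached
            return prereq_of
        prereq_of |= new

# Hypothetical sample data: a chain of three prerequisites.
cp = {("CS750", "CS600"), ("CS600", "CS500"), ("CS500", "CS101")}

# Query: prereqOf(x, 'CS750')?
answer = {x for (x, y) in prereq_closure(cp) if y == "CS750"}
```

The loop runs as many rounds as the longest prerequisite chain requires, so no nesting depth has to be fixed in advance.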
SQL Queries
• List all prereqs of CS750 (including prereqs of prereqs.)
select cp.Prerequisite
from cp
where cp.Course = 'CS750'
union
select cp1.Prerequisite
from cp cp1, cp cp2
where cp1.Course = cp2.Prerequisite
and cp2.Course = 'CS750'
union
…