Relational Algebra 1

advertisement
Relational Algebra –
Basis for Relational Query
Languages
Based on presentation by Juliana Freire
Formal relational query languages
Is this the Algebra you know?
Algebra -> operators and atomic operands
Expressions -> applying operators to atomic operands and/or other
expressions
Algebra of arithmetic: operands are variables and constants, and
operators are the usual arithmetic operators
E.g., (x+y)*2 or ((x+7)/(y-3)) + x
Relational algebra: operands are variables that stand for relations and
relations (sets of tuples), and operations include union, intersection,
selection, projection, Cartesian product, etc
– E.g., (π c-ownerChecking-account) ∩ (π s-ownerSavings-account)
What is a query?
A query is applied to relation instances, and the result of
a query is also a relation instance. (view, query)
– Schemas of input and output fixed, but instances not.
• Operators refer to relation attributes by position or name:
– E.g., Account(number, owner, balance, type)
– Positional notation easier for formal definitions, named-field
notation more readable.
– Both used in SQL
Relational Algebra Operations
The usual set operations: union, intersection, difference
•Operations that remove parts of relations:
selection, projection
•Operations that combine tuples from two relations:
Cartesian product, join
•Since each operation returns a relation,
operations can be composed!
Removing Parts of Relations
• Selection – rows
• Projection - columns
Selection: Example
Another selection
Example of Projection
Projection removes duplicates
Set Operations
• Union
• Intersection
• Difference
What happens when sets unite?
Union Operation – Example
• Relations r, s:A
B
A
B

1

2

2

3

1
s
r
 r  s:
A
B

1

2

1

3
Union Example
Union Compatibility
Intersection
Set Difference Operation –
Example
• Relations r, s:A
B
A
B

1

2

2

3

1
s
r
 r – s:
A
B

1

1
Difference
Another way to show intersection?
Summary so far:
•
•
•
•
•
•
E1 U E2 : union
E1 - E2 : difference
E1 x E2 : cartesian product
c(E1) : select rows, c = condition (book has p for predicate)
IIs(E1) : project columns : s =selected columns
 x(c1,c2) (E1) : rename, x is new name of E1, c1 is new name
of column
Combining Tuples of Two Relations
• Cross product (Cartesian product)
• Joins
Cartesian-Product Operation –
Example
 Relations r, s:
A
B
C
D
E

1

2




10
10
20
10
a
a
b
b
r
s
 r x s:
A
B
C
D
E








1
1
1
1
2
2
2
2








10
10
20
10
10
10
20
10
a
a
b
b
a
a
b
b
Cross Product Example
Cross Product
• How to resolve????
Renaming operator: 
Rename whole relation: Teacher X secondteacher(Teacher) 
Teacher.t-num, Teacher.t-name, secondteacher.t-num, secondteacher.t-name
OR rename attribute before combining:
Teacher X secondteacher(t-num2, t-name2)(Teacher)
t-num, t-name, t-num2, t-name2
OR rename after combining

c(t-num1, t-name1, t-num2, t-name2)(Teacher X Teacher)
t-num1, t-name1, t-num2, t-name2
Join: Example
Join : Example
Condition Join
Equi and Natural Join
Divide operator
Divide Operation
Divide Definition
When to divide?
Division Example
Dividing without division sign
Working out an example
Assignment operation
Why would we use Relational
Algebra?????
Equivalencies help
ER vs RA
• Both ER and the Relational Model can be
used to model the structure of a database.
• Why is it the case that there are only
Relational Databases and no ER
databases?
RA vs Full Programming Language
• Relational Algebra is not Turing complete.
There are operations that cannot be
expressed in relational algebra.
• What is the advantage of using this
language to query a database?
Summary of Operators updated
•
•
•
•
•
•
•
Summary so far:
E1 U E2 : union
E1 - E2 : difference
E1 x E2 : cartesian product
c(E1) : select rows, c = condition (book has p for predicate)
IIs(E1) : project columns : s =selected columns
 x(c1,c2) (E1) : rename, x is new name of E1, c1 is new name of
column
• E1  E2 : division
• E1
E2 : join, c = match condition
Practice
• Find names of stars who’ve appeared in a 1994
movie
• Information about movie year available in Movies; so
need an extra join:
σyear=1994(πname(Stars ⋈ AppearIn ⋈ Movies))
• A more efficient solution:
πname(Stars ⋈ AppearIn ⋈ (σyear=1994( Movies))
• An even more efficient solution:
πname(Stars ⋈
πname(AppearIn ⋈ (πmovieIdσyear=1994(Movies)))
A query optimizer can find this, given the first solution!
Extended Relational Algebra
Operations
• Generalized projection
• Outer join
• Aggregate functions
Generalized projection – calculate
fields
Aggregate Operation – Example
• Relatio
n r:
 g sum(c) (r)
A
B
C








7
sum(c )
27
7
3
10
Aggregate
• Functions on more than one tuple
• Samples:
–
–
–
–
–
–
Sum
Count-distinct
Max
Min
Count
Avg
• Use “as” to rename
branchname
g sum(balance) as totalbalance (account)
Aggregate Operation – Example
• Relation account
by branch-name:
branch_namegrouped
account_number
balance
Perryridge
Perryridge
Brighton
Brighton
Redwood
branch_name
g
A-102
A-201
A-217
A-215
A-222
400
900
750
750
700
sum(balance) (account)
branch_name
Perryridge
Brighton
Redwood
sum(balance)
1300
1500
700
Outer Join
• Keep the outer side even if no join
• Fill in missing fields with nulls
Outer Join – Example
• Relation loan
loan_number branch_name
L-170
L-230
L-260
Downtown
Redwood
Perryridge
amount
3000
4000
1700
 Relation borrower
customer_name loan_number
Jones
Smith
Hayes
L-170
L-230
L-155
Outer Join – Example
• Inner Join
loan
Borrower
loan_number
branch_name
L-170
L-230
Downtown
Redwood
amount customer_name
3000
4000
Jones
Smith
 Left Outer Join
loan
Borrower
loan_number
branch_name
L-170
L-230
L-260
Downtown
Redwood
Perryridge
amount customer_name
3000
4000
1700
Jones
Smith
null
Outer Join – Example
 Right Outer Join
loan
borrower
loan_number
branch_name
L-170
L-230
L-155
Downtown
Redwood
null
amount customer_name
3000
4000
null
Jones
Smith
Hayes
 Full Outer Join
loan
borrower
loan_number
branch_name
L-170
L-230
L-260
L-155
Downtown
Redwood
Perryridge
null
amount customer_name
3000
4000
1700
null
Jones
Smith
null
Hayes
Summary of Operators - Full
•
•
•
•
•
•
E1 U E2 : union
E1 - E2 : difference
E1 x E2 : cartesian product
c(E1) : select rows, c = condition (book has p for predicate)
IIs(E1) : project columns : s =selected columns separated by commas,
can have calculations included
 x(c1,c2) (E1) : rename, x is new name of E1, c1 is new name of column
•
E1  E2 : division
•
E1
•
•
•
E1
E2 : outer join, c = match condition, keep the side with the arrows
 : assignment – give a new name to an expression to make it easy to read
as : rename a calculated column
•
attribute1g function (attribute2) (E1) : perform function on attribute2
whenever attribute1 changes
E2 : join, c = match condition
Download