CMSC424: Database Design
Lecture 5
CMSC424, Spring 2005
Review: Relational Algebra
Relational Algebra Operators
1. Select ()
2. Project ()
3. Set Union (U)
4. Set Difference (-)
5. Cartesian Product ()
6. Rename ()
These are called fundamental operations
Relational Algebra
Redundant Operators
1. Natural Join (⋈)
2. Division (÷)
3. Outer Joins (⟕, ⟖, ⟗)
4. Update (←)
Natural Join
Notation: Relation1 ⋈ Relation2
Idea: match tuples on common attributes
[Figure: example relations r(A, B, C, D) and s(E, B, D), and their natural join r ⋈ s over attributes (A, B, C, D, E)]
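The matching idea can be sketched in Python, assuming each relation is a list of dicts keyed by attribute name. The function name `natural_join` and the sample tuples are hypothetical, not the slide's example, and duplicates are not eliminated.

```python
def natural_join(r, s):
    """Combine every pair of tuples that agree on all shared attributes."""
    result = []
    for t in r:
        for u in s:
            common = t.keys() & u.keys()      # attributes present in both tuples
            if all(t[a] == u[a] for a in common):
                result.append({**t, **u})     # merge the two tuples
    return result

# Hypothetical sample data (not the slide's example); B and D are shared.
r = [{"A": "x", "B": 10, "D": "a"},
     {"A": "y", "B": 20, "D": "b"}]
s = [{"B": 10, "D": "a", "E": 1},
     {"B": 10, "D": "a", "E": 2},
     {"B": 30, "D": "b", "E": 3}]

joined = natural_join(r, s)
```

When the two relations share no attributes, every pair matches and the sketch degenerates to a cartesian product, which agrees with the usual definition of natural join.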
Division
Notation: Relation1 ÷ Relation2
Idea: expresses “for all” queries
Query: Find customers who have accounts in all branches in Brooklyn
r1 ← all branches in Brooklyn
r2 ← associate customers with branches they have accounts in
Now what?
Use the division operator: r2 ÷ r1
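A minimal Python sketch of division for this query, assuming r2 is a set of (cname, bname) pairs and r1 is a set of branch names. The function name `divide` and all data values are hypothetical.

```python
def divide(r2, r1):
    """Return the customers that r2 pairs with every branch in r1."""
    customers = {c for (c, _) in r2}
    return {c for c in customers
            if all((c, b) in r2 for b in r1)}

# Hypothetical data for illustration only.
brooklyn_branches = {"Brighton", "Downtown"}                   # r1
has_account = {("Johnson", "Brighton"), ("Johnson", "Downtown"),
               ("Smith", "Brighton"), ("Hayes", "Perryridge")}  # r2

result = divide(has_account, brooklyn_branches)
```

Here only Johnson is paired with both Brooklyn branches, so Johnson is the only customer in the result.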
Outer Joins
Motivation:
loan =
bname     lno    amt
Downtown  L-170  3000
Redwood   L-230  4000
Perry     L-260  1700
borrower =
cname  lno
Jones  L-170
Smith  L-230
Hayes  L-155
loan ⋈ borrower =
bname     lno    amt   cname
Downtown  L-170  3000  Jones
Redwood   L-230  4000  Smith
Join result loses…
• any record of Perry
• any record of Hayes
Outer Joins
loan =
bname     lno    amt
Downtown  L-170  3000
Redwood   L-230  4000
Perry     L-260  1700
borrower =
cname  lno
Jones  L-170
Smith  L-230
Hayes  L-155
1. Left Outer Join (⟕)
• preserves all tuples in left relation
loan ⟕ borrower =
bname     lno    amt   cname
Downtown  L-170  3000  Jones
Redwood   L-230  4000  Smith
Perry     L-260  1700  ⊥
⊥ = NULL
Outer Joins
loan =
bname     lno    amt
Downtown  L-170  3000
Redwood   L-230  4000
Perry     L-260  1700
borrower =
cname  lno
Jones  L-170
Smith  L-230
Hayes  L-155
2. Right Outer Join (⟖)
• preserves all tuples in right relation
loan ⟖ borrower =
bname     lno    amt   cname
Downtown  L-170  3000  Jones
Redwood   L-230  4000  Smith
⊥         L-155  ⊥     Hayes
⊥ = NULL
Outer Joins
loan =
bname     lno    amt
Downtown  L-170  3000
Redwood   L-230  4000
Perry     L-260  1700
borrower =
cname  lno
Jones  L-170
Smith  L-230
Hayes  L-155
3. Full Outer Join (⟗)
• preserves all tuples in both relations
loan ⟗ borrower =
bname     lno    amt   cname
Downtown  L-170  3000  Jones
Redwood   L-230  4000  Smith
Perry     L-260  1700  ⊥
⊥         L-155  ⊥     Hayes
⊥ = NULL
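The three outer joins can be sketched with one Python helper, assuming relations are lists of dicts and using `None` for NULL. The helper name `outer_join` and its `keep_left` / `keep_right` flags are hypothetical; the data is the loan and borrower example from the slides.

```python
NULL = None  # stands in for the NULL marker used on the slides

loan = [{"bname": "Downtown", "lno": "L-170", "amt": 3000},
        {"bname": "Redwood", "lno": "L-230", "amt": 4000},
        {"bname": "Perry", "lno": "L-260", "amt": 1700}]
borrower = [{"cname": "Jones", "lno": "L-170"},
            {"cname": "Smith", "lno": "L-230"},
            {"cname": "Hayes", "lno": "L-155"}]

def outer_join(left, right, key, keep_left=False, keep_right=False):
    """Join on `key`; optionally keep unmatched tuples, padded with NULL."""
    attrs = {a for t in left + right for a in t}
    out = [{**l, **r} for l in left for r in right if l[key] == r[key]]
    matched = {t[key] for t in out}
    if keep_left:
        out += [{**dict.fromkeys(attrs, NULL), **l}
                for l in left if l[key] not in matched]
    if keep_right:
        out += [{**dict.fromkeys(attrs, NULL), **r}
                for r in right if r[key] not in matched]
    return out

left_outer = outer_join(loan, borrower, "lno", keep_left=True)
right_outer = outer_join(loan, borrower, "lno", keep_right=True)
full_outer = outer_join(loan, borrower, "lno", keep_left=True, keep_right=True)
```

With both flags off the helper returns the plain join, which is the two-tuple result that loses Perry and Hayes.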
Update
Notation: Identifier ← Query
Common Uses:
1. Deletion: r ← r − s
e.g., account ← account − σ bname=Perry (account)
(deletes all Perry accounts)
2. Insertion: r ← r ∪ s
e.g., branch ← branch ∪ {(Waltham, Boston, 7M)}
(inserts new branch with
bname = Waltham, bcity = Boston, assets = 7M)
3. Update: r ← π e1,…,en (r)
e.g., account ← π bname, acct_no, bal*1.05 (account)
(adds 5% interest to account balances)
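The three uses map directly onto set operations. The sketch below assumes relations are Python sets of tuples; the account tuples and the Downtown branch row are hypothetical values chosen for illustration.

```python
# Hypothetical account tuples: (bname, acct_no, bal).
account = {("Perry", "A-1", 100.0), ("Downtown", "A-2", 200.0)}

# Deletion: account <- account - sigma bname=Perry (account)
account = account - {t for t in account if t[0] == "Perry"}

# Insertion: branch <- branch U {(Waltham, Boston, 7M)}
branch = {("Downtown", "Brooklyn", 9_000_000)}   # hypothetical existing row
branch = branch | {("Waltham", "Boston", 7_000_000)}

# Update: account <- pi bname, acct_no, bal*1.05 (account)
account = {(bname, acct_no, bal * 1.05) for (bname, acct_no, bal) in account}
```

Each statement rebinds the name to a new relation, mirroring the assignment arrow in the notation.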
Extended Relational Algebra
1. Generalized projection
2. Aggregates
Generalized Projection
Notation: π e1,…,en (Relation)
e1,…,en can include arithmetic expressions, not just attributes
Example
credit =
cname   limit  balance
Jones   5000   2000
Turner  3000   2500
Then…
π cname, limit − balance (credit) =
cname   limit-balance
Jones   3000
Turner  500
Generalized Projection
Notation: π e1,…,en (Relation)
e1,…,en can include arithmetic expressions, not just attributes
Example
credit =
cname   limit  balance
Jones   5000   2000
Turner  3000   2500
Then…
π cname, limit − balance as limitbalance (credit) =
cname   limitbalance
Jones   3000
Turner  500
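A small Python sketch of generalized projection on the credit example, assuming relations are lists of dicts. The helper name `project` and its keyword-argument interface are hypothetical; each keyword names an output attribute and supplies the expression that computes it.

```python
credit = [{"cname": "Jones", "limit": 5000, "balance": 2000},
          {"cname": "Turner", "limit": 3000, "balance": 2500}]

def project(relation, **exprs):
    """Generalized projection: compute each named output attribute per tuple."""
    return [{name: f(t) for name, f in exprs.items()} for t in relation]

result = project(credit,
                 cname=lambda t: t["cname"],
                 limitbalance=lambda t: t["limit"] - t["balance"])
```

The keyword `limitbalance` plays the role of the `as limitbalance` renaming in the algebra expression.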
Aggregate Functions and Operations
An aggregate function takes a collection of values and
returns a single value as a result.
avg: average value
min: minimum value
max: maximum value
sum: sum of values
count: number of values
Aggregate Operation – Example
Relation r:
A  B  C
α  α  7
α  β  7
β  β  3
β  β  10
𝒢 sum(C) as sumC (r) =
sumC
27
Aggregate Functions and Operations
General form:
G1, G2, …, Gn 𝒢 F1(A1), F2(A2), …, Fn(An) (E)
E is any relational-algebra expression
G1, G2 …, Gn is a list of attributes on which to group (can be empty)
Each Fi is an aggregate function
Each Ai is an attribute name
Aggregate Operation – Example
Relation account grouped by branch-name:
branch-name  account-number  balance
Perryridge   A-102           400
Perryridge   A-201           900
Brighton     A-217           750
Brighton     A-215           750
Redwood      A-222           700
branch-name 𝒢 sum(balance) (account) =
branch-name  balance
Perryridge   1300
Brighton     1500
Redwood      700
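Grouped aggregation can be sketched in Python as one pass that accumulates a running total per group. The sketch assumes tuples of the form (branch-name, account-number, balance); the helper name `group_sum` and its positional-index parameters are hypothetical.

```python
from collections import defaultdict

account = [("Perryridge", "A-102", 400), ("Perryridge", "A-201", 900),
           ("Brighton", "A-217", 750), ("Brighton", "A-215", 750),
           ("Redwood", "A-222", 700)]

def group_sum(relation, group_index, value_index):
    """Sum the value column separately for each distinct grouping value."""
    totals = defaultdict(int)
    for t in relation:
        totals[t[group_index]] += t[value_index]
    return dict(totals)

sums = group_sum(account, 0, 2)
```

Swapping the `+=` accumulation for a running minimum, maximum, or count gives the other aggregate functions in the same shape.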
Other Theoretical Languages
Relational Calculus
Non-procedural
Tuple relational calculus
Examples
Safety
Domain relational calculus
Review: Query Languages
Language            Theoretical Use                    Practical Use
Relational Algebra  Formal semantics of practical QLs  Internal query representation for query optimizers
TRC                 Language expressivity              Foundation for SQL
SQL-92              -                                  Standard for relational DB query languages
SQL-99              -                                  Standard for object-relational DB query languages
OQL                 -                                  Standard for object-oriented DB query languages
XQuery              -                                  Standard for XML-based DB query languages
SQL - Introduction
Standard DML/DDL for relational DBs
• DML = Data Manipulation Language (queries, updates)
• DDL = Data Definition Language (create tables, indexes, …)
Also includes
• View definition
• Security (Authorization)
• Integrity constraints
• Transactions
History
• Early 70s, IBM System R project (SEQUEL)
• Later became a standard (Structured Query Language)
SQL: Basic Structure
SELECT A1, …, An
FROM r1, …, rm
WHERE P
Equivalent to:
π A1,A2,…,An (σP (r1 × … × rm))
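The equivalence can be read as an evaluation recipe: form the product of the FROM relations, filter with the WHERE predicate, then project the SELECT attributes. A minimal Python sketch follows, assuming relations are lists of dicts with attribute names that do not clash across relations. The function name `select_from_where`, the attribute names `d_acct` and `a_acct`, and the data are hypothetical.

```python
from itertools import product

def select_from_where(attrs, relations, predicate):
    """Evaluate pi attrs (sigma predicate (r1 x ... x rm)), keeping duplicates."""
    rows = []
    for combo in product(*relations):       # cartesian product (FROM)
        row = {}
        for t in combo:
            row.update(t)                   # assumes distinct attribute names
        if predicate(row):                  # selection (WHERE)
            rows.append(tuple(row[a] for a in attrs))  # projection (SELECT)
    return rows

# Hypothetical relations for illustration.
depositor = [{"cname": "Jones", "d_acct": "A-1"},
             {"cname": "Smith", "d_acct": "A-2"}]
account = [{"a_acct": "A-1", "balance": 500},
           {"a_acct": "A-2", "balance": 700}]

rows = select_from_where(["cname", "balance"], [depositor, account],
                         lambda row: row["d_acct"] == row["a_acct"])
```

A real query processor would not materialize the full product; the sketch only mirrors the algebraic reading of the query.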
A Simple SELECT-FROM-WHERE Query
SELECT bname
FROM loan
WHERE amt > 1000
Similar to π bname (σ amt > 1000 (loan))
But not quite:
bname
Redwood
Perry
Downtown
Perry
Duplicates are retained (i.e., result not a set)
Why preserve duplicates?
Can instead write:
SELECT DISTINCT bname
FROM loan
WHERE amt > 1000
(removes duplicates from result)
We will discuss bag algebra a bit later
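The difference between the two queries can be checked with Python's built-in `sqlite3` module. The loan rows below are hypothetical, chosen so that one branch has two loans above 1000; SQLite stands in for whichever SQL system the course uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loan (bname TEXT, lno TEXT, amt INTEGER)")
# Hypothetical rows: Perry has two qualifying loans, Mianus has none.
conn.executemany("INSERT INTO loan VALUES (?, ?, ?)", [
    ("Redwood", "L-230", 4000),
    ("Perry", "L-260", 1700),
    ("Downtown", "L-170", 3000),
    ("Perry", "L-300", 2500),
    ("Mianus", "L-93", 500),
])

with_dups = [row[0] for row in
             conn.execute("SELECT bname FROM loan WHERE amt > 1000")]
without_dups = [row[0] for row in
                conn.execute("SELECT DISTINCT bname FROM loan WHERE amt > 1000")]
```

The first list contains Perry twice, while the second contains each branch name once. Neither query fixes the row order, so comparisons should sort the results first.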
Another SELECT-FROM-WHERE Query
SELECT cname, balance
FROM depositor, account
WHERE depositor.acct_no = account.acct_no
Similar to π cname, balance (depositor ⋈ account)
Returns:
cname    balance
Johnson  500
Smith    700
Hayes    400
Turner   350
Johnson  900
Jones    750
Lindsay  700
Note: Can also write
SELECT d.cname, a.balance
FROM depositor as d, account as a
WHERE d.acct_no = a.acct_no
(necessary for self-joins)
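A self-join shows why aliases are needed: the same relation appears twice in FROM, so the two copies must be told apart. The sketch below uses Python's `sqlite3` module with hypothetical depositor rows and a hypothetical query that finds pairs of customers sharing an account.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE depositor (cname TEXT, acct_no TEXT)")
# Hypothetical rows: Johnson and Smith share account A-101.
conn.executemany("INSERT INTO depositor VALUES (?, ?)", [
    ("Johnson", "A-101"), ("Smith", "A-101"), ("Hayes", "A-102"),
])

pairs = conn.execute(
    "SELECT d1.cname, d2.cname "
    "FROM depositor AS d1, depositor AS d2 "
    "WHERE d1.acct_no = d2.acct_no AND d1.cname < d2.cname"
).fetchall()
```

The condition `d1.cname < d2.cname` drops the pairing of a customer with themselves and keeps each pair only once.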
The SELECT Clause
• Equivalent to (generalized) projection, despite name
• Can use ‘*’ to get all attributes
e.g.:
SELECT *
FROM loan
• Can write SELECT DISTINCT to eliminate duplicates
• Can write SELECT ALL to preserve duplicates (default)
• Can include arithmetic expressions
e.g.:
SELECT bname, acct_no, balance*1.05
FROM account
The FROM Clause
• Equivalent to cartesian product (×)
(or ⋈, depending on WHERE clause)
• Binds tuples in relations to variable names
e.g.: FROM borrower, loan
• Computes borrower × loan
• Identifies borrower, loan columns in result, allowing one to write
WHERE borrower.lno = loan.lno
e.g.: FROM borrower as b, loan as l
allows one to write
WHERE b.lno = l.lno
The WHERE Clause
• Equivalent to Selection, despite name
• WHERE predicate can be:
1. Simple
attribute relop attribute (or constant)
(relop: =, <>, <, >, <=, >=)
2. Complex (using AND, OR, NOT, BETWEEN)
e.g.:
SELECT lno
FROM loan
WHERE amt BETWEEN 90000 AND 100000
is the same as…
SELECT lno
FROM loan
WHERE amt >= 90000 AND amt <= 100000
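The equivalence, including the fact that BETWEEN is inclusive at both ends, can be checked with Python's `sqlite3` module. The loan rows are hypothetical and sit on and just outside the two boundaries; ORDER BY is added only to make the two results directly comparable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loan (lno TEXT, amt INTEGER)")
# Hypothetical rows straddling both boundaries.
conn.executemany("INSERT INTO loan VALUES (?, ?)", [
    ("L-1", 89999), ("L-2", 90000), ("L-3", 95000),
    ("L-4", 100000), ("L-5", 100001),
])

between_rows = conn.execute(
    "SELECT lno FROM loan WHERE amt BETWEEN 90000 AND 100000 ORDER BY lno"
).fetchall()
range_rows = conn.execute(
    "SELECT lno FROM loan WHERE amt >= 90000 AND amt <= 100000 ORDER BY lno"
).fetchall()
```

Both queries return L-2, L-3, and L-4, so the boundary values 90000 and 100000 are included.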
Data Definition Language
Allows specification of relation schema as well as:
Attribute domains
Integrity constraints
Security and authorization information
Creation of Indexes
…
DDL
CREATE TABLE branch
(branch-name char(15) not null,
branch-city char(30),
assets integer,
primary key (branch-name),
check (assets >= 0))
DROP TABLE branch
ALTER TABLE branch ADD zipcode integer
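These statements can be tried with Python's `sqlite3` module. The sketch writes the column names with underscores (`branch_name`, `branch_city`), since a hyphen is not valid in an unquoted SQL identifier. The inserted rows are hypothetical, and SQLite is an assumption here: it accepts `char(15)` as a type name but does not enforce the length.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE branch (
        branch_name CHAR(15) NOT NULL,
        branch_city CHAR(30),
        assets INTEGER,
        PRIMARY KEY (branch_name),
        CHECK (assets >= 0)
    )
""")
conn.execute("ALTER TABLE branch ADD zipcode INTEGER")
conn.execute("INSERT INTO branch VALUES ('Waltham', 'Boston', 7000000, NULL)")

# A negative asset value violates the CHECK constraint and is rejected.
try:
    conn.execute("INSERT INTO branch VALUES ('Bad', 'Nowhere', -1, NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

columns = [row[1] for row in conn.execute("PRAGMA table_info(branch)")]
conn.execute("DROP TABLE branch")
```

`PRAGMA table_info` is specific to SQLite; other systems expose the schema through their own catalog tables.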