CMSC424: Database Design Instructor: Amol Deshpande

advertisement
CMSC424: Database
Design
Instructor: Amol Deshpande
amol@cs.umd.edu
Today


Mapping relational algebra to SQL
Complex SQL Queries



See “movies database sql queries” on class
webpage for the actual queries
SQL Formal Semantics
Advanced SQL Features
SQL Query Examples

Movie(title, year, length, inColor, studioName, producerC#)

StarsIn(movieTitle, movieYear, starName)

MovieStar(name, address, gender, birthdate)

MovieExec(name, address, cert#, netWorth)

Studio(name, address, presC#)

Queries:

Producer with maximum average length of movies

Find producer of Star Wars.

All producers of movies in which harrison ford stars
SQL Query Examples

Movie(title, year, length, inColor, studioName, producerC#)

StarsIn(movieTitle, movieYear, starName)

MovieStar(name, address, gender, birthdate)

MovieExec(name, address, cert#, netWorth)

Studio(name, address, presC#)

Queries:

Find movie titles that appear more than once

Find number of people 3 hops away from Kevin Bacon
More SQL


Select *
into temptable
from X1, …
Having

WHERE is to FROM what HAVING is to
GROUPBY
Duplicates

By definition, relations are sets


Problem:



Not practical to remove duplicates after every operation
Why ?
So…


So  No duplicates allowed
SQL by default does not remove duplicates
SQL follows bag semantics, not set semantics

Implicitly we keep count of number of copies of each tuple
Formal Semantics of SQL
RA can only express SELECT DISTINCT queries
•
To express SQL, must extend RA to a bag algebra
 Bags (aka: multisets) like sets, but can have duplicates
e.g: {5, 3, 3}
e.g: homes =
cname
ccity
Johnson
Smith
Johnson
Smith
Brighton
Perry
Brighton
R.H.
• Next: will define RA*: a bag version of RA
Formal Semantics of SQL: RA*
*p (r):
preserves copies in r
e.g: *city = Brighton (homes) =
cname
Johnson
Johnson
ccity
Brighton
Brighton
2. *A1, …, An (r): no duplicate elimination
e.g:  *cname (homes) =
cname
Johnson
Smith
Johnson
Smith
Formal Semantics of SQL: RA*
r * s :
additive union
A
1
1
2
B
α
α
β
A B
2 β
3 α
1 α
*
r
4. r -* s:
e.g: r -* s =
=
s
A B
1 α
1 α
2 β
2 β
3 α
1 α
bag difference
A B
1 α
s -* r =
A B
3 α
Formal Semantics of SQL: RA*
r * s:
A
1
1
2
cartesian product
B
α
α
β
*
C
+
-
=
A
1
1
1
1
2
2
B
α
α
α
α
β
β
C
+
+
+
-
Formal Semantics of SQL
Query:
SELECT
FROM
WHERE
a1, ….., an
r1, ….., rm
p
Semantics: *A1, …, An (*p (r1  * …  * rm) )
Query:
SELECT DISTINCT
FROM
WHERE
(1)
a1, ….., an
r1, ….., rm
p
Semantics: What is the only operator to change in (1)?
 A1, …, An (*p (r1  * …  * rm) )
(2)
Set/Bag Operations Revisited
Set Operations
UNION
INTERSECT
EXCEPT
≡ U
≡ ∩
≡ -
Bag Operations
UNION ALL
INTERSECT ALL
EXCEPT ALL
≡ U*
≡ ∩*
≡ -*
Duplicate Counting:
Given m copies of t in r, n copies of t in s, how many copies of t in:
r UNION ALL s?
A: m + n
r INTERSECT ALL s?
A: min (m, n)
r EXCEPT ALL s?
A: max (0, m-n)
SQL: Summary
Clause
Eval
Order
SELECT [(DISTINCT)]
FROM
WHERE
INTO
GROUP BY
HAVING
ORDER BY
4
1
2
7
3
5
6
AS
UNION ALL
UNION
8
(similarly intersection,
except)
Semantics (RA/RA*)
 (or *)
*
*

Extended relational operator
g
*
Can’t express: requires ordered
sets, bags
r
U*
U
Next

Database updates
Modification of the Database – Deletion
Delete all account records at the Perryridge branch
delete from account
where branch-name = ‘Perryridge’
Delete all accounts at every branch located in Needham city.
delete from account
where branch-name in (select branch-name
from branch
where branch-city = ‘Needham’)
delete from depositor
where account-number in
(select account-number
from branch, account
where branch-city = ‘Needham’
and branch.branch-name = account.branch-name)
Example Query
Delete the record of all accounts with balances
below the average at the bank.
delete from account
where balance < (select avg (balance)
from account)
Problem: as we delete tuples from deposit, the average balance
changes
Solution used in SQL:
1. First, compute avg balance and find all tuples to delete
2. Next, delete all tuples found above (without recomputing avg or
retesting the tuples)
Modification of the Database – Insertion
Add a new tuple to account
insert into account
values (‘A-9732’, ‘Perryridge’,1200)
or equivalently
insert into account (branch-name, balance, accountnumber)
values (‘Perryridge’, 1200, ‘A-9732’)
Add a new tuple to account with balance set to null
insert into account
values (‘A-777’,‘Perryridge’, null)
Modification of the Database – Updates
Increase all accounts with balances over $10,000 by
6%, all other accounts receive 5%.
Write two update statements:
update account
set balance = balance  1.06
where balance > 10000
update account
set balance = balance  1.05
where balance  10000
The order is important
Can be done better using the case statement
Download