SQL 2 – The Sequel CS 186, Spring 2007, Lecture 8 R&G, Chapter 5 Mary Roth The important thing is not to stop questioning. Albert Einstein Life is just a bowl of queries. -Anon (not Forrest Gump) Administrivia • Homework 1 due Tuesday, Feb 13 10 p.m. • Coming up: – Homework 2 handed out Feb 13 – Midterm 1: in class February 22 • Questions? Review: SQL DML • DML includes 4 main statements: SELECT (query), INSERT, UPDATE and DELETE PROJECT • e.g: SELECT S.name, E.cid FROM Students S, Enrolled E WHERE S.sid = E.sid AND JOIN S.age=19 SELECT Example: Find sailors who have reserved a red and a green boat SELECT R.sid FROM Boats B,Reserves R WHERE B.color = ‘red’ AND R.bid=B.bid INTERSECT SELECT R.sid FROM Boats B,Reserves R WHERE B.color = ‘green’ AND R.bid=B.bid sid 1 2 sid 1 = Now let’s do this with a self-join Boats bid bname color 101 Nina red 102 Pinta blue 103 Santa Maria red 105 Titanic green Reserves sid bid day sid 1 101 9/12 1 2 103 9/13 1 105 9/13 Find sids of sailors who’ve reserved a red and a green boat SELECT R1.sid FROM Boats B1, Reserves R1, Boats B2, Reserves R2 WHERE B1.color = ‘red’ AND B1.bid=R1.bid Find red reserved boats Find green reserved boats AND B2.color = ‘green’ AND B2.bid=R2.bid Find matching green and red reserved boats AND R1.sid=R2.sid bid bname color 101 Nina red 102 Pinta blue 103 Santa Maria red 105 Titanic green sid bid day 1 101 9/12 2 103 9/13 1 105 9/13 sid 1 2 sid 1 sid = 1 Nested Queries • The WHERE clause can itself contain an SQL query! (Actually, so can FROM and HAVING clauses can too) e.g. Find the names of sailors who’ve reserved boat #103: Reserves SELECT S.sname FROM Sailors S WHERE S.sid IN (SELECT R.sid FROM Reserves R sname WHERE R.bid=103) Sailors sid sid bid day 1 101 9/12 2 103 9/13 1 105 9/13 Bilbo sname rating age S 1 Frodo 7 22 S 2 Bilbo 2 39 S 3 Sam 8 27 First compute the set of all sailors that have reserved boat 103… …and then check each the sid of each tuple in Sailors to see if it is in S1 sid 2 Nested Queries • This nested query was uncorrelated because the subquery does not refer to anything in the enclosing query SELECT S.sname FROM Sailors S WHERE S.sid IN (SELECT R.sid FROM Reserves R WHERE R.bid=103) It can evaluated once and then checked for each tuple in enclosing query Nested Queries with correlation • Nested queries can also be correlated; the subquery refers to the enclosing query SELECT S.sname FROM Sailors S WHERE EXISTS (SELECT * FROM Reserves R WHERE R.bid=103 AND R.sid=S.sid) The subquery must be re-evaluated for each tuple in enclosing query EXISTS is a set operator that is true if result of set expression has at least one tuple Nested Queries e.g. Find the names of sailors who’ve reserved boat #103: Notice that this query computes the same answer as the previous query! SELECT S.sname FROM Sailors S WHERE EXISTS (SELECT * FROM Reserves R WHERE R.bid=103 AND R.sid=S.sid) 1 2 3 Sailors sid Reserves sname rating age S 1 Frodo 7 22 S 2 Bilbo 2 39 S 3 Sam 8 27 sid bid day sid 1 101 9/12 2 2 103 9/13 1 105 9/13 bid day 103 9/13 Set-Comparison Operators • <tuple expression> IN <set expression> • • – True if <tuple> is a member of <set> – Also, NOT IN EXISTS <set expression> – True if <set expression> evaluates to a set with at least one member – Also, NOT EXISTS UNIQUE <set expression> – True if <set expression> evaluates to a set with no duplicates; each row can appear exactly once – Also, NOT UNIQUE • <tuple expression> comparison op ANY <set expression> – True if <set expression> contains at least one member that makes the comparison true – Also, op ALL Use NOT Exists for Division Recall: X/Y means only give me X tuples that have a match in Y. Find sailors who’ve reserved all boats. X = set of sailors and Y = set of all boats with reservations. SELECT S.sname Find Sailors S such that ... FROM Sailors S WHERE NOT EXISTS there is no boat B... (SELECT B.bid FROM Boats B WHERE NOT EXISTS (SELECT R.bid without a reservation by FROM Reserves R Sailor S WHERE R.bid=B.bid AND R.sid=S.sid)) Division SELECT S.sname FROM Sailors S WHERE NOT EXISTS (SELECT B.bid FROM Boats B WHERE NOT EXISTS (SELECT R.bid FROM Reserves R 101 103 WHERE R.bid=B.bid 1 3 AND R.sid=S.sid)) 2 Reserves Sailors sid sid bid day R 1 103 9/12 R R R R 2 103 9/13 3 103 9/14 3 101 9/12 1 103 9/13 Boats sname rating age S 1 Frodo 7 22 S 2 Bilbo 2 39 S 3 Sam 8 27 bid bname color B 101 Nina red B 103 Pinta blue UNIQUE Find the names of sailors who’ve reserved boat #103 exactly once SELECT S.sname FROM Sailors S WHERE UNIQUE (SELECT sid, bid FROM Reserves R WHERE R.bid=103 AND S.sid=R.sid) 1 2 3 Sailors sid Reserves sname rating age S 1 Frodo 7 22 S 2 Bilbo 2 39 S 3 Sam 8 27 sid bid day sid 1 103 9/12 2 2 103 9/13 1 103 9/13 bid day 103 9/13 ANY Find sailors whose rating is greater than that of some sailor called Bilbo: Correlated or uncorrelated? SELECT * FROM Sailors S WHERE S.rating > ANY(SELECT S2.rating FROM Sailors S2 WHERE S2.sname=‘Bilbo’) S1 Uncorrelated! S2 sid sname rating age sid sname rating age 1 Frodo 7 22 2 1 Bilbo Frodo 2 7 39 22 2 Bilbo 2 39 2 Bilbo 2 39 3 Sam 8 27 3 Sam 8 27 Intermission Aggregate Operators • Very powerful; enables computations over sets of tuples • COUNT: returns a count of tuples in the set • AVG: returns average of column values in the set • SUM: returns sum of column values in the set • MIN, MAX: returns min (max) value of column values in a set. • DISTINCT can be added to COUNT, AVG, SUM to perform computation only over distinct values. SELECT COUNT (*) FROM Sailors S SELECT AVG (S.age) FROM Sailors S WHERE S.rating=10 SELECT AVG(DISTINCT S.age) FROM Sailors S WHERE S.rating=10 Aggregate Operators Find name and age of the oldest sailor(s) SELECT S.sname, MAX (S.age) FROM Sailors S X What will the result be? sname age Frodo 39 Bilbo 39 Sam 39 Sailors sid sname rating age 1 Frodo 7 22 2 Bilbo 2 39 3 Sam 8 27 Not legal syntax; no other columns allowed in SELECT clause without a GROUP BY clause (we’ll learn about those next) Aggregate Operators Find name and age of the oldest sailor(s) Instead: SELECT S.sname, S.age And then find the FROM Sailors S sailors(s) of that age… WHERE S.age = Find the maximum age… (SELECT MAX (S2.age) FROM Sailors S2) GROUP BY and HAVING • So far, we’ve seen aggregate operators applied to all tuples. – What if we want to apply ops to each of several groups of tuples? • Consider: Find the age of the youngest sailor for each rating level. – In general, we don’t know how many rating levels exist, and what the rating values for these levels are! – Suppose we know that rating values go from 1 to 10; we can write 10 queries that look like this (!): For i = 1, 2, ... , 10: SELECT MIN (S.age) FROM Sailors S WHERE S.rating = i Queries With GROUP BY • To generate values for a column based on groups of rows, use aggregate functions in SELECT statements with the GROUP BY clause Returning 1 row per group SELECT [DISTINCT] target-list FROM relation-list [WHERE qualification] GROUP BY grouping-list And finally compute aggregate function over each group… First select these rows… Then group them by the values in these columns… target-list contains: • list of column names from grouping–list • terms with aggregate operations (e.g., MIN (S.age)). Group By Example For each rating, find the age of the youngest sailor with age 18 Sailors SELECT S.rating, MIN (S.age) FROM Sailors S WHERE S.age >= 18 GROUP BY S.rating 7 22 2 21 8 27 Group 1 Group 2 Group 3 sid sname rating age 1 Frodo 7 22 2 Bilbo 2 39 3 Sam 8 27 4 Pippin 2 21 5 Merry 8 17 1 Frodo 7 22 2 Bilbo 2 39 4 Pippin 2 21 3 Sam 8 27 Find the number of reservations for each red boat. SELECT B.bid, COUNT(*)AS tot_res FROM Boats B, Reserves R Boats WHERE R.bid=B.bid AND bid bname B.color=‘red’ 101 Nina GROUP BY B.bid Pinta Santa Maria red 1 103 101 2 Reserves 103 9/13 3 101 9/14 4 101 9/14 red 102 103 2 color blue sid bid day 1 102 9/12 2 103 9/13 3 101 9/14 4 101 9/14 Queries With GROUP BY and HAVING SELECT [DISTINCT] target-list FROM relation-list WHERE qualification GROUP BY grouping-list HAVING group-qualification • Use the HAVING clause with the GROUP BY clause to restrict which group-rows are returned in the result set Find the age of the youngest sailor with age 18, for each rating with at least 2 such sailors SELECT S.rating, MIN (S.age) FROM Sailors S WHERE S.age >= 18 GROUP BY S.rating HAVING COUNT (*) > 1 rating 1 7 7 2 8 10 age 33.0 45.0 35.0 55.5 35.0 rating 1 7 38 10 m-age 33.0 35.0 55.0 35.0 sid 22 31 71 64 29 58 sname rating age Dustin 7 45.0 lubber 8 55.5 zorba 10 16.0 horatio 7 35.0 brutus 1 33.0 rusty 10 35.0 count 1 2 1 1 rating 7 35.0 Answer Sailors who have reserved all boats Sailors sid sname rating age 1 Frodo 7 22 2 Bilbo 2 39 3 Sam 8 27 SELECT S.name FROM Sailors S, reserves R WHERE S.sid = R.sid GROUP BY S.name, S.sid HAVING COUNT(DISTINCT R.bid) = ( Select COUNT (*) FROM Boats) Boats sname sid Frodo bid bname color 101 Nina red bid 102 Pinta blue 1 102 103 Santa Maria red Bilbo 2 101 Bilbo 2 102 Frodo 1 102 Bilbo 2 103 Reserves sname sid count Frodo 1 1 count sid bid day Bilbo 2 3 3 1 102 9/12 2 102 9/12 sname sid bid 2 101 9/14 Frodo 1 102,102 Bilbo 2 101, 102, 103 1 102 9/10 2 103 9/13 More about Joins SELECT (column_list) FROM table_name [INNER | {LEFT |RIGHT | FULL } OUTER] JOIN table_name ON qualification_list WHERE … Explicit join semantics needed unless it is an INNER join (INNER is default) Default semantics: Inner Join Only rows that match search conditions are returned. SELECT s.sid, s.name, r.bid FROM Sailors s INNER JOIN Reserves r ON s.sid = r.sid Returns only those sailors who have reserved boats SQL-92 also allows: SELECT s.sid, s.name, r.bid FROM Sailors s NATURAL JOIN Reserves r “NATURAL” means equi-join for each pair of attributes with the same name SELECT s.sid, s.name, r.bid FROM Sailors s INNER JOIN Reserves r ON s.sid = r.sid sid 22 31 95 sname rating age Dustin 7 45.0 Lubber 8 55.5 Bob 3 63.5 sid bid day 22 101 10/10/96 95 103 11/12/96 s.sid s.name r.bid 22 Dustin 101 95 Bob 103 Left Outer Join Left Outer Join returns all matched rows, plus all unmatched rows from the table on the left of the join clause (use nulls in fields of non-matching tuples) SELECT s.sid, s.name, r.bid FROM Sailors s LEFT OUTER JOIN Reserves r ON s.sid = r.sid Returns all sailors & information on whether they have reserved boats SELECT s.sid, s.name, r.bid FROM Sailors s LEFT OUTER JOIN Reserves r ON s.sid = r.sid sid 22 31 95 sname rating age Dustin 7 45.0 Lubber 8 55.5 Bob 3 63.5 s.sid 22 95 31 sid bid day 22 101 10/10/96 95 103 11/12/96 s.name r.bid Dustin 101 Bob 103 Lubber Right Outer Join Right Outer Join returns all matched rows, plus all unmatched rows from the table on the right of the join clause SELECT r.sid, b.bid, b.name FROM Reserves r RIGHT OUTER JOIN Boats b ON r.bid = b.bid Returns all boats & information on which ones are reserved. SELECT r.sid, b.bid, b.name FROM Reserves r RIGHT OUTER JOIN Boats b ON r.bid = b.bid sid bid day 22 101 10/10/96 95 103 11/12/96 r.sid bid 101 102 103 104 b.bid 22 95 101 102 103 104 bname Interlake Interlake Clipper Marine b.name Interlake Interlake Clipper Marine color blue red green red Full Outer Join Full Outer Join returns all (matched or unmatched) rows from the tables on both sides of the join clause SELECT r.sid, b.bid, b.name FROM Reserves r FULL OUTER JOIN Boats b ON r.bid = b.bid Returns all boats & all information on reservations SELECT r.sid, b.bid, b.name FROM Reserves r FULL OUTER JOIN Boats b ON r.bid = b.bid sid bid day 22 101 10/10/96 95 103 11/12/96 r.sid bid 101 102 103 104 b.bid 22 95 101 102 103 104 bname Interlake Interlake Clipper Marine color blue red green red b.name Interlake Interlake Clipper Marine Note: in this case it is the same as the ROJ because bid is a foreign key in reserves, so all reservations must have a corresponding tuple in boats. INSERT INSERT [INTO] table_name [(column_list)] VALUES ( value_list) INSERT [INTO] table_name [(column_list)] <select statement> INSERT INTO Boats VALUES ( 105, ‘Clipper’, ‘purple’) INSERT INTO Boats (bid, color) VALUES (99, ‘yellow’) “bulk insert” from one table to another (must be type compatible): INSERT INTO TEMP(bid) SELECT r.bid FROM Reserves R WHERE r.sid = 22; “bulk insert” from files (in Postgres) Copy DELETE & UPDATE DELETE [FROM] table_name [WHERE qualification] DELETE FROM Boats WHERE color = ‘red’ DELETE FROM Boats b WHERE b. bid = (SELECT r.bid FROM Reserves R WHERE r.sid = 22) Can also modify tuples using UPDATE statement. UPDATE Boats SET Color = “green” WHERE bid = 103; Null Values • Values are sometimes – unknown (e.g., a rating has not been assigned or – inapplicable (e.g., no spouse’s name). – SQL provides a special value null for such situations. • The presence of null complicates many issues. E.g.: – Special operators needed to check if value is/is not null. – “rating>8” - true or false when rating is null? What about AND, OR and NOT connectives? – Need a 3-valued logic (true, false and unknown). – Meaning of constructs must be defined carefully. (e.g., WHERE clause eliminates rows that don’t evaluate to true.) – New operators (in particular, outer joins) possible/needed. Null Values – 3 Valued Logic (null > 0) is null (null + 1) is null (null = 0) is null null AND true is null AND T F Null OR T F Null T T F Null T T T T F F F F F T F Null NULL Null F Null NULL T Null Null