SQL 1 Database Principles Connecting to db2: • Using your userID and password connect to our server. ssh –l userID avalon.cs.newpaltz.edu • Connect to the Library database $ db2 connect to Library $ db2 “set schema = db2inst1” OS process db2 instance Library userID1 userID2 each schema has tables created by the schema owner manages various databases Sample ... each database has one schema per user db2inst1 Cardholder Database Principles Book ... Once Connected: • db2inst1 schema has Library tables already created. • Viewing the tables: select * from Cardholder select * from Book select * from Copy select * from Reserves select * from Borrows select * from Status Database Principles * means “all columns” Two Kinds of Interaction: • Command-line: $ db2 “select * from Cardholder” • Execute a file: – place your query in a file # query1.sql end all commands with a ; select * from Cardholder; – execute the file $ db2 –tf query1.sql • Latter is better: – easier to edit – don’t have to type same command twice. Database Principles Split Personality • SQL is made up of two parts – DDL (Data Definition Language) – DML (Data Manipulation Language) • DDL: – Mostly “create” statements: create table, create index, create view, drop table, … • DML: Made up of 4 commands: insert, delete, update and select Database Principles DDL vs DML: • The nature of DDL commands is that – they are used infrequently, – have syntax that does not vary and – can always be looked up in a book. • The DML commands tend to – be unique to the situation, – contain very complicated logic and – require great effort to become proficient. Database Principles DDL: • I have nothing more to say. • If you want to be good at it, read a book Database Principles DML: • Four commands – insert, delete, update and select – that do the obvious. • • • • Insert: Add new rows to a table Delete: Remove rows from a table Update: Modify columns in certain table rows Select: Retrieve information from database tables. • All four commands share the property that the programming problem they present is the logic of expressing what rows to add, drop, modify or retrieve. Database Principles Basic Syntax (db2): • Insert Statement: Insert into <table name> (<list of columns>) values (<matching list of values>) or Insert into <table name> (<list of columns>) select_statement (see def’n later) • Delete Statement: Delete from <table name> where <condition describing rows to be deleted> Database Principles Basic Syntax: • Update Statement: update <table name> set column name = new value, ... where <condition describing rows to be updated> • Select Statement: Select <list of output columns> from <list of input tables> where <condition describing rows to be retrieved> NOTE 1: Other versions of SQL such as T-SQL allow the update and delete statements to contain a from_clause. db2 does not. NOTE 2: When the <list of output columns> is * this means “all columns in all tables in the from_clause” Database Principles Shared Property and Our Focus: • The shared property that these four commands share is a where_clause condition. • It turns out that learning to write the where_clause condition correctly is what it is all about. • We shall spend our time learning to write where_clause conditions in the context of select statements. Database Principles Procedural vs Declarative Languages: • Most programming languages are procedural; that is the programmer must write down a sequence of instructions that are to be executed when the program is run. • Java, C++, Perl and Php are all procedural languages. • Using a procedural language, the programmer figures out what needs to be done and then writes out an algorithm for how to accomplish the desired task. • Defining what is primarily a design activity. • Defining how is primarily a programming activity. Database Principles SQL is a Declarative Language • With SQL you focus on saying what not how. • This sounds simple. • Easier to say “write a program that X” than to actually write the program. • However, expressing “what” in terms of precise logic takes some training and that is what we will do in this section of the course Database Principles SQL SELECT: • Many of the operations of Relational Algebra – select, project, join, set difference and quotient – are all performed in SQL with the SELECT command. • Instead of putting together a sequence of commands in Relational Algebra we ask one query in SQL. • This one query can contain all the logic of several Relational Algebra queries. Database Principles Hard to be Complete and Concise: • In the book, “The River Runs Through”, the author, Norman MacLean, as a young boy writes a page describing something. • His father says, “That’s fine, now reduce all you have said to one paragraph and don’t leave anything out.” • When the boy returns with the result his father says, “That’s fine, now reduce all you have said to one sentence and don’t leave anything out.” • Being complete and concise at the same time is hard! Database Principles SELECT Syntax: joins and conditions final project Select <select_list> from <table_list> where <condition> input tables • select_list: A list of columns/expressions that make up the output of the query. • table_list: A list of tables that make up the input of the query. • condition: A boolean expression (sometimes very complex) that must be true about rows in the various tables in the table_list in order to produce an output looking like the select_list. Database Principles Select Example 1: Select b_name, b_addr from Cardholder • This query gives a complete list of all cardholder names and addresses. • Things to note: – The select_list acts like the final project in a Relational Algebra query – The from_clause only contains one table – There is no where_clause condition so the where_clause is TRUE for each row of Cardholder. Database Principles How to Imagine the Query being Executed • Imagine there is a row-variable, ch, that visits each row of the table. • Each time it stops at a row the where_clause is evaluated. • Each time the where_clause is TRUE, values are extracted from the row and sent to output. ch Database Principles Select Example 2: Acts like Relational Algebra PROJECT Select b_name, b_addr from Cardholder where b_status = ‘junior’ Acts like Relational Algebra SELECT where_clause false false true false true false false ch ch ch ch ch ch ch Answer b_name b_addr jo-ann john New Paltz Kingston Database Principles NOTE: The condition b_status = ‘junior’ is called a SARG (Search ARGument) Select Example 3: • Find the cardholders (b_name, b_addr) with ch and b are called Reservations on December 1. table aliases Select b_name, b_addr from cardholder ch, borrows b where ch.borrowerid = b.borrowerid and l_date = ‘Dec 1’ called a join-term b ch ch b b_name b_addr john susan New Paltz Wallkill Database Principles Imagining a Multi-table Query Select b_name, b_addr from cardholder ch, borrows b where ch.borrowerid = b.borrowerid and l_date = ‘Dec 1’ ch b for each row in Cardholder (ch) for each row in Borrows (b) if ( ch.borrowerid = b.borrowerid and b.l_date = ‘Dec 1’) then copy columns in select_list to output Database Principles Alternative Syntax: project join Select b_name, b_addr from cardholder ch, borrows b where ch.borrowerid = b.borrowerid and l_date = ‘Dec 1’ select Can also be written NOTE: In terms of Relational Algebra both these queries combine select, project and join Select b_name, b_addr from cardholder ch inner join borrows b on ch.borrowerid = b.borrowerid where l_date = ‘Dec 1’ By this logic the join terms are removed from the where_clause and placed inside the from_clause. Database Principles More Complicated Join Query: • Find the cardholders (b_name, b_addr) who have reserved a book that was borrowed on Dec 1. Include in your answer the book (author, title) involved. Select b_name, b_addr, author, title from cardholder ch, reserves r, book k, copy c, borrows b where ch.borrowerid = r.borrowerid AND r.ISBN = k.ISBN AND k.ISBN = c.ISBN AND c.accession_no = b.accession_no AND b.l_date = ‘Dec 1’ unique answers Select distinct b_name, b_addr, author, title from cardholder ch, reserves r, book k, copy c, borrows b where ch.borrowerid = r.borrowerid AND r.ISBN = k.ISBN AND k.ISBN = c.ISBN AND c.accession_no = b.accession_no AND b.l_date = ‘Dec 1’ Database Principles # cardholder reserved something # that is a book # which has a copy # that was loaned # on Dec 1 Subqueries: • It is possible to nest one query inside another. This is called a subquery. • Subqueries act somewhat like subordinate clauses in an English sentence. Who reserved the books that John from New Paltz borrowed? • The query strategy can be – First find all books borrowed by John from New Paltz – Then find who reserved “those” books. Database Principles Subqueries 2: • Since the results of subqueries are sets then we need special set operators to manipulate them. • The special boolean operators are IN and EXISTS • IN (tests set membership): element IN (select … ) tests if the thing on the left (element) belongs to the set which is the result of the subquery. • EXISTS (tests set existence): EXISTS (select … ) tests if the subquery has an empty result set or not. Database Principles Subquery Example (cont): Who reserved the books that John from New Paltz borrowed? Select distinct b_name, b_addrb_name, b_addr Select distinct from cardholderfrom ch, reserves r ch, reserves r, cardholder where ch.borroweridcopy = r.borrowerid c, borrowsAND b, cardholder ch1 r.ISBN IN where ch.borrowerid = r.borrowerid AND (select c.ISBN r.ISBN = c.ISBN AND from copy c, borrows b, cardholder ch1 c.accession_no = b.accession_no AND where c.accession_nob.borrowerid = b.accession_no AND = ch1.borrowerid AND b.borrowerid = ch1.borrowerid AND AND ch1.b_name = ‘John’ ch1.b_name = ‘John’ AND = ‘New Paltz’ ch1.b_addr ch1.b_addr = ‘New Paltz’) Exercise: Rewrite this query without using a subquery. Database Principles UNION Query: • Find the cardholders (b_name, b_addr) who have borrowed or reserved a book. (select distinct b_name, b_addr from cardholder ch, borrows b where ch.borrowerid = b.borrowerid) UNION (select distinct b_name, b_addr from cardholder ch, reserves r where ch.borrowerid = r.borrowerid) NOTE: There is a possibility of duplicate rows in the answer for a couple of reasons - a cardholder may reserve or borrow more than one book - a cardholder may reserve one book and borrow another The key word “distinct” makes sure the output of the subqueries are distinct and the key word “UNION” makes sure that duplicate answers across the subqueries are not displayed. To see all duplicates drop the “distinct” key word and replace “UNION” with “UNION ALL” Database Principles Query Exercise: • Rewrite the previous query without using UNION. • Find the cardholders (b_name, b_addr) who have borrowed or reserved a book. select distinct b_name, b_addr from cardholder ch, borrows b, reserves r where ch.borrowerid = b.borrowerid or ch.borrowerid = r.borrowerid) Database Principles INTERSECT Query: • Find the cardholders (b_name, b_addr) who have both borrowed and reserved books (select distinct b_name, b_addr from cardholder ch, borrows b where ch.borrowerid = b.borrowerid) INTERSECT (select distinct b_name, b_addr from cardholder ch, reserves r where ch.borrowerid = r.borrowerid) Database Principles SET DIFFERENCE Query: • Find the cardholders (b_name, b_addr) who have borrowed a book but not reserved any books (select distinct b_name, b_addr from cardholder ch, borrows b where ch.borrowerid = b.borrowerid) MINUS (select distinct b_name, b_addr from cardholder ch, reserves r where ch.borrowerid = r.borrowerid) Database Principles Hard Queries: • Like RA, hard queries signaled by key words: only, all, not, none and so forth. • Like RA, hard queries resolved using techniques other than select, project and join. • Using the MINUS operator as in our previous example, we can implement set differences. • We can also implement set differences using NOT IN. element NOT IN (select …) Database Principles NOT IN Queries: • Advantage of the NOT IN syntax: it reads like English. Find cardholders (b_name, b_addr) who have only reserved books on Dec 1 • Rewrite this same query removing the word “only” and using the phrase “not in” instead. Find cardholders (b_name, b_addr) who have reserved a book and are not in the set of cardholders who have reserved books on a date other than Dec 1 BookReservers Reservers of Books only on Dec 1 Reservers of Books on a date other than Dec 1 Database Principles English-to-SQL Translation: • The latest version of our query translated easily into SQL Find cardholders (b_name, b_addr) who have reserved a book and those cardholders are not in the set of cardholders who have reserved books on a date other than Dec 1 Select b_name, b_addr from cardholder ch, reserves r where ch.borrowerid = r.borrowerid AND ch.borrowerid NOT IN (select borrowerid from reserves where r_date != ‘Dec 1’) Exercise: Rewrite this query using the MINUS operator Database Principles Using Minus Find cardholders (b_name, b_addr) who have only reserved books on Dec 1 (select b_name, b_addr from cardholder ch, reserves r where ch.borrowerid = r.borrowerid and r_date = “Dec 1”) MINUS (select b_name, b_addr from cardholder ch, reserves r where ch.borrowerid = r.borrowerid and r_date != “Dec 1”) Database Principles Observation: • Both the MINUS syntax and the NOT IN syntax allow the programmer to program the query in steps: – First find what you don’t want – Throw it away – You are left with what you want. • Since one of these queries may run faster than the other you will need to know how to write both. • You get paid not only for correctness but probably more important, speed. Database Principles Negation Queries: • Like RA, some negation queries are “easy” in SQL and some are “hard”. • The former use SARGS and join_terms only. • The latter usually need to use NOT IN. • Like RA, the hard queries come from negating the presence on an entity in some relationship where the minimum participation number is 0. Database Principles Hard Negation Query Example 1 • Find the cardholders who have not borrowed anything. • Rewritten as: Find the cardholders who are not in the set of borrowers of books. Select b_name, b_addr from cardholder where borrowerid NOT IN (select borrowerid from borrows) Database Principles Hard Negation Query Example 2 • Find the cardholders who have not borrowed any book they have reserved. • Rewritten as: Find the cardholders who have reserved something and are not in the set of borrowers of books they have also reserved. Select b_name, b_addr from cardholder ch, reserves r where ch.borrowerid = r.borrowerid AND ch.borrowerid NOT IN (select b.borrowerid from borrows b, copy c, reserves r where b.borrowerid = r.borrowerid AND b.accession_no = c.accession_no AND c.ISBN = r.ISBN) Database Principles Hard Negation Query Example 3 • This query is called a co-related subquery. • Find the cardholders who have not borrowed the same book they have reserved. • Rewritten as: Find the cardholders who have reserved at least one book and that book not in the set of books they have borrowed. Select b_name, b_addr from cardholder ch, reserves r where ch.borrowerid = r.borrowerid AND r.ISBN NOT IN (select c.ISBN from borrows b, copy c where b.borrowerid = ch.borrowerid AND b.accession_no = c.accession_no) Database Principles Co-related Subquery: • What makes the subquery “co-related” is the presence of a table alias inside the subquery that is defined outside the subquery. • This is very similar to using a global variable inside function definition. • It is easy to pick up because in the English version of the query you will have a definite article (the, those, that) or definite pronoun (they, she, he, it) inside the subquery phrase that refers to something outside the subquery phrase. Database Principles EXISTS: • EXISTS can be used with a subquery to decide if it is empty or not. Find the cardholders who have borrowed a book Find the cardholders where there exists a book loan made by them select b_name, b_addr from cardholder ch where exists (select * from borrows b where b.borrowerid = ch.borrowerid) • The logic is that if the subquery is empty this EXISTS is false so nothing is returned. Database Principles Exercise: • Write the previous query without using EXISTS. select distinct b_name, b_addr from cardholder ch, borrows b where b.borrowerid = ch.borrowerid) Database Principles NOT EXISTS: Find the cardholders who have reserved all AW books • In RA we used the QUOTIENT operator for this query. • In SQL we use NOT EXISTS • Rewrite the query in English with NOT EXISTS Find the cardholders where there does NOT EXIST a book published by Addison-Wesley that they did not reserve. Find the cardholders where there does NOT EXIST a book published by Addison-Wesley and that book is NOT IN the set of books they reserved. Select b_name, b_addr from cardholder ch where NOT EXISTS (select * from book k where k.pub_name = ‘AW’ AND k.ISBN NOT IN (select ISBN from reserves r where r.borrowerid = ch.borrowerid)) Database Principles