IST 601: Database Management
SQL I – Data Manipulation (Chapter 6)
Denis L. Nkweteyim
16/11/22
Denis L. Nkweteyim@UB - IST601 (Database Management - SQL)

Outline
● Introduction
● Simple Queries
● Sorting Results (ORDER BY Clause)
● Using the SQL Aggregate Functions
● Grouping Results (GROUP BY Clause)
● Subqueries
● ANY and ALL
● Multi-table Queries
● EXISTS and NOT EXISTS
● Combining Result Tables (UNION, INTERSECT, EXCEPT)
● Database Updates

What is SQL
● Language used for accessing and manipulating databases and the data contained in them
● SQL is a declarative language
  ● Unlike procedural and object-oriented languages (C, C++, Java, Visual Basic, etc.)
  ● You do not specify how the DBMS should go about retrieving data
  ● You simply specify (declare) what you want to do in SQL
  ● The DBMS uses its query optimizer to decide the best way to do it
● SQL keywords are not case-sensitive
  ● You can use lower-, upper-, or mixed case

What is SQL
● The ISO SQL standard does not use the formal terms of relations, attributes, and tuples, instead using the terms tables, columns, and rows
● SQL does not adhere strictly to the definition of the relational model
● Examples
  ● The table produced as the result of a SELECT statement can contain duplicate rows
  ● SQL imposes an ordering on the columns
  ● The user can order the rows of a result table

Literals in SQL
● Literals are constants that are used in SQL statements
● Different forms of literals for every data type supported
● 2 common types: literals that are enclosed in single (or double) quotes and those that are not
  ● All nonnumeric data values must be enclosed in single quotes
  ● All numeric data values must not be enclosed in single quotes
● Example: Inserting data into a table
    INSERT INTO PropertyForRent(propertyNo, street, city, postcode, type, rooms, rent, ownerNo, staffNo, branchNo)
    VALUES ('PA14', '16 Holhead', 'Aberdeen', 'AB7 5SU', 'House', 6, 650.00, 'CO46', 'SA9', 'B007');
  ● The value in column rooms is an integer literal and the value in column rent is a decimal number literal; they are not enclosed in single quotes
  ● All other columns are character strings and are enclosed in single quotes
● Note
  ● MySQL is forgiving if you enclose numeric literals in quotes

● We shall use the Sakila database for our queries

Install the Sakila Database
● Download the Sakila database archive from the following page
  ● https://dev.mysql.com/doc/index-other.html
● Follow the instructions below to set up the database
  ● Extract the installation archive to a temporary location
  ● Connect to the MySQL server using the mysql command-line client
      mysql -u root -p
  ● Type the following commands at the MySQL prompt
      mysql> SOURCE <dir>/sakila-schema.sql;
      mysql> SOURCE <dir>/sakila-data.sql;
  ● Confirm that the sample database is installed correctly
      mysql> USE sakila;
      Database changed
      mysql> SHOW FULL TABLES;

Select
● Most frequently used SQL command
● Purpose
  ● To retrieve and display data from one or more database tables
  ● Capable of performing the equivalent of the relational algebra's Selection, Projection, and Join operations in a single statement

Select Syntax
    SELECT [DISTINCT | ALL] {* | [columnExpression [AS newName]] [, . . .]}
    FROM TableName [alias] [, . . .]
    [WHERE condition]
    [GROUP BY columnList]
    [HAVING condition]
    [ORDER BY columnList]
● columnExpression
  ● Represents a column name or an expression
● TableName
  ● The name of an existing database table or view that you have access to
● alias
  ● An optional abbreviation for TableName
● Two mandatory clauses
  ● SELECT and FROM
  ● The others are optional

Select Syntax
● Sequence of processing in a SELECT statement
  ● FROM
    ● Specifies the table or tables to be used
  ● WHERE
    ● Filters the rows subject to some condition
    ● Equivalent to the select condition in relational algebra
  ● GROUP BY
    ● Forms groups of rows with the same column value
  ● HAVING
    ● Filters the groups subject to some condition
  ● SELECT
    ● Specifies which columns are to appear in the output
    ● Equivalent to projection in relational algebra
  ● ORDER BY
    ● Specifies the order of the output

Simple Select Queries
    SELECT actor_id, first_name, last_name, last_update from actor;
● List full details from the actor table
● Note
  ● No restrictions are specified, so the WHERE clause is unnecessary, and all columns are required
  ● As above, many SQL retrievals require all columns of a table
  ● A quick way to express “all columns” is to use an asterisk (*)
    SELECT * from actor;
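The simple SELECT forms above can be tried without a MySQL server. The following is a minimal sketch, assuming Python's built-in sqlite3 module in place of MySQL and a hypothetical three-row stand-in for the Sakila actor table (the names are invented for illustration).

```python
import sqlite3

# Hypothetical three-row stand-in for Sakila's actor table (names invented).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE actor (actor_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO actor VALUES (?, ?, ?)",
    [(1, "ANNA", "STONE"), (2, "BEN", "REED"), (3, "CARA", "MILLS")],
)

# Listing every column by name and using * select the same columns.
explicit = conn.execute("SELECT actor_id, first_name, last_name FROM actor").fetchall()
star = conn.execute("SELECT * FROM actor").fetchall()

# Projection: only the two name columns appear in the result.
names = conn.execute("SELECT first_name, last_name FROM actor").fetchall()
print(names)
```

Because no ORDER BY is given, the sketch makes no assumption about the order in which rows come back.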

Simple Select Queries
    SELECT first_name, last_name from actor;
● Display just the first and last names from the actor table

Simple Select Queries
    SELECT district, city_id FROM address;
● Notice that there are several duplicates
● Unlike the relational algebra Projection operation, SELECT does not eliminate duplicates when it projects over one or more columns
● SQL is a bag language
  ● Based on bags (multisets) rather than sets
  ● Unlike sets, bags allow for duplicates, and it is easier to implement a system based on bags than on sets

Simple Select Queries
    SELECT distinct district, city_id FROM address;
● To eliminate the duplicates, we use the DISTINCT keyword

Simple Select Queries
    SELECT rental_id, amount FROM payment;
● List the rental id and amount paid in the payment table

Simple Select Queries
    SELECT rental_id, amount*2 FROM payment;
● List the rental id and expected amount for 2 months
● Notice that the column name for the expected amount is not very user friendly; we can rename it

Simple Select Queries
    SELECT rental_id, amount*2 AS 'Amount for 2 Months' FROM payment;
● List the rental id and expected amount for 2 months
● The AS clause gives the computed column a more user-friendly name
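The effect of DISTINCT and of renaming a computed column with AS can be checked with a short sketch. It assumes Python's sqlite3 module instead of MySQL, and the address and payment rows are invented. The alias is written with double quotes, the standard SQL form for a delimited identifier, where the slides use the single-quoted form that MySQL accepts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Invented rows: two addresses share the same (district, city_id) pair.
conn.execute("CREATE TABLE address (address_id INTEGER PRIMARY KEY, district TEXT, city_id INTEGER)")
conn.executemany(
    "INSERT INTO address (district, city_id) VALUES (?, ?)",
    [("Alberta", 300), ("Alberta", 300), ("QLD", 576)],
)
all_rows = conn.execute("SELECT district, city_id FROM address").fetchall()
distinct_rows = conn.execute("SELECT DISTINCT district, city_id FROM address").fetchall()

# A computed column, renamed with AS so the heading is readable.
conn.execute("CREATE TABLE payment (rental_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO payment VALUES (?, ?)", [(1, 2.5), (2, 4.0)])
cur = conn.execute(
    'SELECT rental_id, amount * 2 AS "Amount for 2 Months" FROM payment ORDER BY rental_id'
)
headers = [col[0] for col in cur.description]
doubled = cur.fetchall()

print(len(all_rows), len(distinct_rows))
print(headers, doubled)
```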

Row Selection (WHERE clause)
● WHERE Clause
  ● Used to restrict the rows that are retrieved
  ● Consists of the keyword WHERE followed by a search condition that specifies the rows to be retrieved
● Five basic search conditions (or predicates)
  ● Comparison: Compare the value of one expression to the value of another expression
  ● Range: Test whether the value of an expression falls within a specified range of values
  ● Set membership: Test whether the value of an expression equals one of a set of values
  ● Pattern match: Test whether a string matches a specified pattern
  ● Null: Test whether a column has a null (unknown) value

Comparison Operators
● Operators
  ● = equals
  ● <> is not equal to (ISO standard)
  ● != is not equal to (allowed in some dialects)
  ● < is less than
  ● > is greater than
  ● <= is less than or equal to
  ● >= is greater than or equal to
● More complex predicates can be generated using the logical operators AND, OR, and NOT
  ● Use parentheses (if needed or desired) to show the order of evaluation
  ● An expression is evaluated left to right
  ● Subexpressions in brackets are evaluated first
  ● NOTs are evaluated before ANDs and ORs
  ● ANDs are evaluated before ORs

Comparison Search Condition
    select actor_id, first_name, last_name from actor where first_name = 'Joe';
● ID number, first name, and last name of actors whose first name is “Joe”

Comparison Search Condition
    SELECT country_id, country FROM country where country_id = 19;
    SELECT country_id, country FROM country where country_id <= 10;
● List country id and country name for the country whose country id is 19
● List country id and country name for countries whose country id is less than or equal to 10
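The evaluation rules for NOT, AND, and OR listed under Comparison Operators can be seen in a small sketch. It assumes Python's sqlite3 module and a hypothetical four-row film table. Note that SQLite compares strings case-sensitively by default, so the ratings are written exactly as stored, whereas MySQL's default collation would also match 'pg'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE film (title TEXT, rating TEXT, length INTEGER)")
conn.executemany(
    "INSERT INTO film VALUES (?, ?, ?)",
    [("A", "PG", 180), ("B", "PG", 90), ("C", "R", 180), ("D", "G", 60)],
)


def titles(where):
    """Return the titles satisfying the given search condition, sorted."""
    sql = f"SELECT title FROM film WHERE {where} ORDER BY title"
    return [row[0] for row in conn.execute(sql)]


# Both conditions must hold.
long_pg = titles("rating = 'PG' AND length > 175")

# AND is evaluated before OR: this reads as G OR (PG AND length > 175).
no_parens = titles("rating = 'G' OR rating = 'PG' AND length > 175")

# Parentheses force the OR to be evaluated first.
with_parens = titles("(rating = 'G' OR rating = 'PG') AND length > 175")

# NOT is evaluated before AND: (NOT rating = 'PG') AND length > 175.
not_first = titles("NOT rating = 'PG' AND length > 175")

print(long_pg, no_parens, with_parens, not_first)
```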

Comparison Search Condition
    SELECT address, district, city_id FROM sakila.address where district = 'alberta' or district = 'qld';
● List address, district, and city id for the districts of Alberta and QLD

Comparison Search Condition
    SELECT title, length FROM film where rating = 'pg' and length > 175;
● Titles and lengths of films rated PG that are longer than 175 minutes

Range Search Condition (BETWEEN/NOT BETWEEN)
    SELECT first_name, last_name FROM actor where first_name between 's' and 'u';
    SELECT first_name, last_name FROM actor where first_name not between 'a' and 'w';

Set Membership Search Condition (IN/NOT IN)
    SELECT address, district, postal_code FROM address where district in ('Texas', 'California');

Set Membership Search Condition (IN/NOT IN)
    SELECT * FROM language;
    SELECT language_id, name FROM language where name not in ('English', 'French');

Pattern Matching
● WHERE clauses can have conditions in which a string is compared with a pattern, to see if it matches
● General form
  – <Attribute> LIKE <pattern> or
  – <Attribute> NOT LIKE <pattern>
● Pattern is a quoted string with
  – % wildcard
    ● Represents any sequence of zero or more characters, i.e., any string
  – _ wildcard
    ● Represents any single character
  – All other characters in the pattern represent themselves
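The range and set membership conditions above have no captions, so a sketch may help show what they select. It assumes Python's sqlite3 module and invented rows. The point worth noticing is that BETWEEN includes both endpoints, and that a string range ending at 'U' admits the value 'U' itself but not longer names that merely start with U.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE actor (first_name TEXT)")
conn.executemany(
    "INSERT INTO actor VALUES (?)",
    [("ADAM",), ("SANDRA",), ("TOM",), ("U",), ("UMA",)],
)
conn.execute("CREATE TABLE language (language_id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO language (name) VALUES (?)",
    [("English",), ("Italian",), ("French",), ("German",)],
)


def column(sql):
    """Run a one-column query and return its values as a list."""
    return [row[0] for row in conn.execute(sql)]


# BETWEEN is inclusive; 'UMA' sorts after 'U' and so falls outside the range.
in_range = column(
    "SELECT first_name FROM actor WHERE first_name BETWEEN 'S' AND 'U' ORDER BY first_name"
)
# BETWEEN is shorthand for a pair of comparisons joined by AND.
same_range = column(
    "SELECT first_name FROM actor WHERE first_name >= 'S' AND first_name <= 'U' ORDER BY first_name"
)
out_of_range = column(
    "SELECT first_name FROM actor WHERE first_name NOT BETWEEN 'S' AND 'U' ORDER BY first_name"
)
# NOT IN keeps the rows whose value is absent from the list.
other_languages = column(
    "SELECT name FROM language WHERE name NOT IN ('English', 'French') ORDER BY name"
)

print(in_range, out_of_range, other_languages)
```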

Pattern Matching Examples
● LIKE 'H%'
  – The first character must be H, but the rest of the string can be anything
● LIKE 'H___'
  – There must be exactly four characters in the string, the first of which must be an H
● LIKE '%e'
  – Any sequence of characters, of length at least 1, with the last character an e
● LIKE '%Buea%'
  – A sequence of characters of any length containing Buea
● NOT LIKE 'H%'
  – The first character cannot be an H
● If the search string can include the pattern-matching character itself, we can use an escape character to represent the pattern-matching character
  – To check for the string '15%', we can use the predicate: LIKE '15#%' ESCAPE '#'

Pattern Matching Examples
    select first_name, last_name from actor where last_name like '%LI%';
● Actors whose last names contain the letters LI

Pattern Matching Examples
    SELECT address, district, phone FROM address where phone like '99%';
● Address, district, and phone numbers for all phone numbers that begin with 99

Pattern Matching Examples
    SELECT address, district, phone FROM address where phone like '_90%';
● Address, district, and phone numbers for all phone numbers with 90 as the second and third digits

Null Values
● Tuples in SQL relations can have NULL as a value for one or more components
● Meaning of NULL varies and depends on context
● Two common cases
  – Missing value: e.g., we know that an email address exists, but we do not know what it is
  – Inapplicable: e.g., the value of attribute spouse for an unmarried person
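The LIKE patterns described in the pattern matching slides can be exercised against a handful of strings. This is a sketch under stated assumptions: it uses Python's sqlite3 module, a hypothetical table named sample, and invented values chosen so that each pattern matches a different subset. SQLite's LIKE ignores case for ASCII letters by default, similar to MySQL's default behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (id INTEGER PRIMARY KEY, val TEXT)")
conn.executemany(
    "INSERT INTO sample (val) VALUES (?)",
    [("Hello",), ("Hope",), ("Home",), ("Buea Town",), ("15%",), ("150",)],
)


def matching(predicate):
    """Return the values satisfying `val <predicate>`, in insertion order."""
    sql = f"SELECT val FROM sample WHERE val {predicate} ORDER BY id"
    return [row[0] for row in conn.execute(sql)]


starts_h = matching("LIKE 'H%'")        # H followed by anything
four_h = matching("LIKE 'H___'")        # exactly four characters, first is H
ends_e = matching("LIKE '%e'")          # last character is e
has_buea = matching("LIKE '%Buea%'")    # contains Buea anywhere
not_h = matching("NOT LIKE 'H%'")       # does not start with H

# Without an escape character, % after 15 is still a wildcard.
unescaped = matching("LIKE '15%'")
# With ESCAPE '#', the sequence #% stands for a literal percent sign.
literal_pct = matching("LIKE '15#%' ESCAPE '#'")

print(starts_h, four_h, ends_e, has_buea, not_h, unescaped, literal_pct)
```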

Comparing Null Values
● The logic of conditions in SQL is NOT the well-known 2-valued logic: TRUE or FALSE
● But rather a 3-valued logic: TRUE, FALSE, UNKNOWN
● When any value is compared with NULL, the truth value is UNKNOWN
● But a query only produces a tuple in the answer if its truth value for the WHERE clause is TRUE (not FALSE or UNKNOWN)

Comparing Null Values
● To understand how AND, OR, and NOT work in 3-valued logic, think of
  – TRUE = 1, FALSE = 0, and UNKNOWN = ½
  – AND = MIN; OR = MAX; NOT(x) = 1 - x
● Example
    TRUE AND (FALSE OR NOT(UNKNOWN))
      = MIN(1, MAX(0, (1 - ½)))
      = MIN(1, MAX(0, ½))
      = MIN(1, ½)
      = ½ (i.e., UNKNOWN)

Three-valued Logic May Lead to Surprises
● Why?
  – Some common laws, like the commutativity of AND, hold in 3-valued logic
  – But others do not
  – Example: the “law of excluded middle”
    ● p OR NOT p = TRUE (in two-valued logic)
    ● In three-valued logic, when p = UNKNOWN, the left side is MAX(½, (1 - ½)) = ½ (i.e., UNKNOWN)

NULL Search Condition (IS NULL/IS NOT NULL)
    SELECT first_name, last_name, username, password FROM staff;
    SELECT first_name, last_name, username, password FROM staff where password is null;
    SELECT first_name, last_name, username, password FROM staff where password is not null;
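The MIN/MAX/1-x model of three-valued logic, and its consequence for NULL comparisons, can be checked directly. The sketch below is illustrative only: the helper names and3, or3, and not3 are invented, the staff rows are hypothetical, and Python's sqlite3 module stands in for MySQL.

```python
import sqlite3
from fractions import Fraction

# Truth values encoded as in the slides: TRUE = 1, FALSE = 0, UNKNOWN = 1/2.
TRUE, FALSE, UNKNOWN = Fraction(1), Fraction(0), Fraction(1, 2)


def and3(a, b):
    return min(a, b)


def or3(a, b):
    return max(a, b)


def not3(a):
    return 1 - a


# TRUE AND (FALSE OR NOT(UNKNOWN)) evaluates to UNKNOWN.
example = and3(TRUE, or3(FALSE, not3(UNKNOWN)))
# The law of excluded middle fails when p is UNKNOWN.
excluded_middle = or3(UNKNOWN, not3(UNKNOWN))

# The same behaviour seen through a WHERE clause (hypothetical staff rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (username TEXT, password TEXT)")
conn.executemany("INSERT INTO staff VALUES (?, ?)", [("mike", "secret"), ("jon", None)])

# Comparing with NULL yields UNKNOWN for every row, so nothing is returned.
eq_null = conn.execute("SELECT username FROM staff WHERE password = NULL").fetchall()
# IS NULL is the predicate that actually tests for nulls.
is_null = conn.execute("SELECT username FROM staff WHERE password IS NULL").fetchall()
# p OR NOT p is UNKNOWN for the row with a null password, so that row is dropped.
p_or_not_p = conn.execute(
    "SELECT username FROM staff WHERE password = 'secret' OR NOT password = 'secret'"
).fetchall()

print(example, excluded_middle, eq_null, is_null, p_or_not_p)
```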

Sorting Results (ORDER BY Clause)
● In general, the rows of an SQL query result table are not arranged in any particular order
● The ORDER BY clause in the SELECT statement can be used to sort them
● The ORDER BY clause consists of a list of column identifiers that the result is to be sorted on, separated by commas
● A column identifier may be either
  ● A column name, or
  ● A column number that identifies an element of the SELECT list by its position within the list, 1 being the first (leftmost) element in the list, 2 the second element in the list, and so on
● Column numbers could be used if the column to be sorted on is an expression and no AS (i.e., Rename) clause is specified to assign the column a name
● Sorting order
  ● Ascending (ASC) or descending (DESC) order on any column or combination of columns
● The ORDER BY clause must always be the last clause of the SELECT statement

Sorting Examples
    SELECT * FROM category order by name asc;
    SELECT * FROM category order by name desc;

Sorting Examples
    SELECT address, district FROM address where district = 'california' order by district asc, address asc;
    select first_name, last_name from actor where last_name like '%LI%' order by first_name, last_name;

Using the SQL Aggregate Functions
● We often want to perform some form of summation or aggregation of data, similar to the totals at the bottom of a report
● 5 aggregate functions
  ● COUNT – returns the number of values in a specified column
  ● SUM – returns the sum of the values in a specified column
  ● AVG – returns the average of the values in a specified column
  ● MIN – returns the smallest value in a specified column
  ● MAX – returns the largest value in a specified column
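The ORDER BY rules described earlier (column names, column numbers, ASC and DESC, several sort keys) can be illustrated with a short sketch. It assumes Python's sqlite3 module and three invented payment rows. The second query sorts by position because its second column is an unnamed expression.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payment (customer_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO payment VALUES (?, ?)", [(1, 3.0), (2, 5.0), (3, 3.0)])

# Sort by column name: largest amount first, ties broken by customer_id ascending.
by_name = conn.execute(
    "SELECT customer_id, amount FROM payment ORDER BY amount DESC, customer_id ASC"
).fetchall()

# Sort by column number: 2 refers to the expression amount * 2, 1 to customer_id.
by_position = conn.execute(
    "SELECT customer_id, amount * 2 FROM payment ORDER BY 2 DESC, 1 ASC"
).fetchall()

print(by_name)
print(by_position)
```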

Using the SQL Aggregate Functions
● Aggregate functions
  ● Operate on a single column of a table and return a single value
  ● COUNT, MIN, and MAX apply to both numeric and nonnumeric fields
  ● SUM and AVG may be used on numeric fields only
  ● Apart from COUNT(*), each function eliminates nulls first and operates only on the remaining nonnull values
  ● COUNT(*) is a special use of COUNT that counts all the rows of a table, regardless of whether nulls or duplicate values occur
  ● To eliminate duplicates before the function is applied, use the keyword DISTINCT before the column name in the function
  ● We can explicitly specify the keyword ALL if we do not want to eliminate duplicates, although ALL is assumed if nothing is specified

Using the SQL Aggregate Functions
● DISTINCT
  ● Has no effect with the MIN and MAX functions. However, it may have an effect on the result of SUM or AVG, so consideration must be given to whether duplicates should be included or excluded in the computation
  ● Can be specified only once in a query

Using the SQL Aggregate Functions
● Some notes
  ● An aggregate function can be used only in the SELECT list and in the HAVING clause
  ● If the SELECT list includes an aggregate function and no GROUP BY clause is being used to group data together, then no item in the SELECT list can include any reference to a column unless that column is the argument to an aggregate function
  ● Example: the following query is illegal
      SELECT staffNo, COUNT(salary) FROM Staff;
    The query does not have a GROUP BY clause, and the column staffNo in the SELECT list is used outside an aggregate function
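How the aggregate functions treat nulls and duplicates can be confirmed with one query. The sketch assumes Python's sqlite3 module and a hypothetical payment table holding the amounts 2, 2, 4 and one NULL; integer amounts are used here only to keep the arithmetic exact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payment (payment_id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany("INSERT INTO payment (amount) VALUES (?)", [(2,), (2,), (4,), (None,)])

row = conn.execute(
    """
    SELECT COUNT(*),                -- counts every row, including the null one
           COUNT(amount),           -- nulls are eliminated first
           COUNT(DISTINCT amount),  -- duplicates are eliminated as well
           SUM(amount),
           SUM(DISTINCT amount),    -- DISTINCT changes the sum
           MIN(amount),
           MAX(amount),
           AVG(amount),             -- average over the three nonnull values
           AVG(DISTINCT amount)     -- average over the two distinct values
    FROM payment
    """
).fetchone()

print(row)
```

MIN and MAX would give the same answers with or without DISTINCT, which is why the query does not bother applying it to them.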

Aggregate Functions Examples
    SELECT count(*) FROM actor;
    select first_name, last_name from actor where last_name like '%LI%';
    SELECT count(*) FROM actor where last_name like '%LI%';

Aggregate Functions Examples
    select count(category_id) FROM sakila.film_category;
    select count(distinct category_id) FROM film_category;

Aggregate Functions Examples
    SELECT count(payment_id) as 'No of Payments', sum(amount) as Total FROM payment;
    SELECT min(amount) as 'Minimum Payment', max(amount) as 'Maximum Payment', avg(amount) as 'Average Payment' from payment;
● Number of payments and sum of payment amounts
● Minimum, maximum, and average payment

Grouping Results (GROUP BY Clause)
● The aggregate functions as we have seen generate a single row of summary data
● However, it is often useful to have subtotals in reports
● The GROUP BY clause of the SELECT statement can be used for this
  ● A grouped query groups the data from the SELECT table(s) and produces a single summary row for each group
  ● The columns named in the GROUP BY clause are called the grouping columns
  ● When GROUP BY is used, each item in the SELECT list must be single-valued per group
  ● The SELECT clause may contain only
    ● column names, aggregate functions, constants, or an expression involving combinations of these elements

Grouping Results (GROUP BY Clause)
● All column names in the SELECT list must appear in the GROUP BY clause unless the name is used only in an aggregate function
● The contrary is not true
  ● There may be column names in the GROUP BY clause that do not appear in the SELECT list.
● When the WHERE clause is used with GROUP BY
  ● The WHERE clause is applied first,
  ● Then groups are formed from the remaining rows that satisfy the search condition
● Two nulls are considered to be equal for purposes of the GROUP BY clause
  ● If two rows have nulls in the same grouping columns and identical values in all the nonnull grouping columns, they are combined into the same group

GROUP BY Examples
    SELECT category_id, count(film_id) FROM film_category group by category_id;
● Film categories and the number of films in each category

GROUP BY Examples
    SELECT customer_id, count(customer_id), sum(amount) FROM payment where customer_id < 6 group by customer_id;
    SELECT customer_id, staff_id, count(customer_id), sum(amount) FROM payment where customer_id < 6 group by customer_id,staff_id;

Restricting Groupings (HAVING clause)
● HAVING Clause
  ● Designed for use with the GROUP BY clause to restrict the groups that appear in the final result table
  ● Similar in syntax to WHERE, but serves a different purpose
    ● The WHERE clause filters individual rows going into the final result table
    ● HAVING filters groups going into the final result table
  ● Column names used in the HAVING clause must also appear in the GROUP BY list or be contained within an aggregate function
  ● In practice, the search condition in the HAVING clause always includes at least one aggregate function; otherwise the search condition could be moved to the WHERE clause and applied to individual rows
    ● Remember that aggregate functions cannot be used in the WHERE clause
● Note
  ● The HAVING clause is not a requirement for SQL
  ● Any query expressed using a HAVING clause can always be rewritten without the HAVING clause
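The order in which WHERE, GROUP BY, and HAVING act, and the treatment of nulls as one group, can be traced with a sketch. It assumes Python's sqlite3 module and eight invented payment rows, two of which have a NULL customer_id purely to show the null group; the threshold in the HAVING clause is also invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payment (customer_id INTEGER, staff_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO payment VALUES (?, ?, ?)",
    [(1, 1, 5), (1, 2, 3), (1, 1, 2), (2, 1, 4), (2, 1, 1), (7, 1, 9),
     (None, 1, 6), (None, 2, 8)],
)

# WHERE runs first (dropping customer 7 and the null rows), then groups are formed.
per_customer = conn.execute(
    "SELECT customer_id, COUNT(customer_id), SUM(amount) FROM payment "
    "WHERE customer_id < 6 GROUP BY customer_id ORDER BY customer_id"
).fetchall()

# Two grouping columns give one summary row per (customer, staff) pair.
per_pair = conn.execute(
    "SELECT customer_id, staff_id, COUNT(customer_id), SUM(amount) FROM payment "
    "WHERE customer_id < 6 GROUP BY customer_id, staff_id "
    "ORDER BY customer_id, staff_id"
).fetchall()

# HAVING then filters whole groups using an aggregate.
big_spenders = conn.execute(
    "SELECT customer_id, COUNT(customer_id), SUM(amount) FROM payment "
    "WHERE customer_id < 6 GROUP BY customer_id HAVING SUM(amount) > 6 "
    "ORDER BY customer_id"
).fetchall()

# Without the WHERE clause, the two null customer_ids fall into a single group.
groups = {
    row[0]: row[1:]
    for row in conn.execute(
        "SELECT customer_id, COUNT(*), SUM(amount) FROM payment GROUP BY customer_id"
    )
}

print(per_customer, per_pair, big_spenders, groups)
```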

HAVING Examples
    SELECT customer_id, count(customer_id), sum(amount) FROM payment where customer_id < 6 group by customer_id having count(customer_id)>22;
    SELECT customer_id, count(customer_id), sum(amount) FROM payment where customer_id < 6 group by customer_id having sum(amount) < 130;

HAVING Examples
    SELECT customer_id, staff_id, count(customer_id), sum(amount) FROM payment where customer_id < 6 group by customer_id,staff_id;
    SELECT customer_id, staff_id, count(customer_id), sum(amount) FROM payment where customer_id < 6 group by customer_id,staff_id HAVING sum(amount) > 55;

Subqueries (Nested Queries)
● A subquery is
  – A parenthesized SELECT statement embedded within another SELECT statement
  – Can be used as a value in a number of places, including FROM, WHERE, HAVING clauses
  – Results of this inner SELECT statement (or subselect) are used in the outer statement to help determine the contents of the final result
● Subselects may also appear in INSERT, UPDATE, and DELETE statements

Subqueries (Nested Queries)
● 3 types of subquery
  – Scalar subquery
    ● Returns a single column and a single row, that is, a single value
    ● Can be used whenever a single value is needed
  – Row subquery
    ● Returns multiple columns, but only a single row
    ● Can be used whenever a row value constructor is needed, typically in predicates
  – Table subquery
    ● Returns one or more columns and multiple rows
    ● Can be used whenever a table is needed, for example, as an operand for the IN predicate
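The scalar and table subquery types can be illustrated with a sketch, which also shows one way a HAVING query may be rewritten using a subquery in the FROM clause. It assumes Python's sqlite3 module; the customer and payment rows, the customer names, and the thresholds are all invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Ann"), (2, "Bob"), (3, "Cy")])
conn.execute("CREATE TABLE payment (customer_id INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO payment VALUES (?, ?)", [(1, 5), (1, 3), (2, 4), (3, 1)])

# Scalar subquery: the inner SELECT returns one value (the average, 3.25).
above_average = conn.execute(
    "SELECT customer_id, amount FROM payment "
    "WHERE amount > (SELECT AVG(amount) FROM payment) ORDER BY customer_id"
).fetchall()

# Table subquery: the inner SELECT returns a column of values used by IN.
paid_four_or_more = conn.execute(
    "SELECT name FROM customer WHERE customer_id IN "
    "(SELECT customer_id FROM payment WHERE amount >= 4) ORDER BY name"
).fetchall()

# A grouped query with HAVING ...
with_having = conn.execute(
    "SELECT customer_id, SUM(amount) FROM payment "
    "GROUP BY customer_id HAVING SUM(amount) > 4 ORDER BY customer_id"
).fetchall()

# ... and the same result from a subquery in FROM, filtered by an ordinary WHERE.
without_having = conn.execute(
    "SELECT customer_id, total FROM "
    "(SELECT customer_id, SUM(amount) AS total FROM payment GROUP BY customer_id) AS t "
    "WHERE total > 4 ORDER BY customer_id"
).fetchall()

print(above_average, paid_four_or_more, with_having, without_having)
```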