CP 363 – Week 3

Joins – Another form of joins

Vendors(vendor_id, vendor_name, vendor_address1, ...)
Invoices(invoice_id, vendor_id, invoice_number, invoice_date, invoice_total, payment_total, ...)

Example:
o SELECT v.vendor_name, i.invoice_number, i.invoice_date, i.invoice_total, i.payment_total
  FROM vendors AS v, invoices AS i
  WHERE v.vendor_id = i.vendor_id AND i.invoice_total >= 0
  ORDER BY v.vendor_name, i.invoice_total DESC;

NOTE:
o Remember: there are two ways to write an inner join, the implicit form shown above (tables listed in FROM, join condition in WHERE) and the explicit form (INNER JOIN ... ON).
o Readability is an important aspect of SQL code, both for later editing and for clarity.

INSERT

The INSERT INTO statement is used to insert new records into a table. There are three syntax forms:
o INSERT INTO table_name (column1, column2, column3, ...) VALUES (value1, value2, value3, ...);
o INSERT INTO table_name VALUES (value1, value2, value3, ...);
o INSERT INTO table_name SET column1=value1, column2=value2, ... columnK=valueK; (a MySQL extension)
o Be careful about constraints on the data, such as data types and NOT NULL.

Examples:
o INSERT INTO vendors VALUES (30, 'NEC', '200 toku-cho', 'Kyoto', 'Kyoto', '16-2600');
o INSERT INTO invoices SET invoice_id=34, vendor_id=31;

UPDATE

The UPDATE statement is used to modify existing records in a table.
o UPDATE table_name SET column1 = value1, column2 = value2, ...
WHERE condition;

Examples:
An UPDATE statement that changes one value in one row
o UPDATE invoices SET credit_total = 35.89 WHERE invoice_number = 'H40000' AND vendor_id = 4;
An UPDATE statement that changes one value in multiple rows (note: vendor_id is not a key of invoices, so several rows can match)
o UPDATE invoices SET invoice_due_date = DATE_ADD(invoice_due_date, INTERVAL 10 DAY) WHERE vendor_id = 4;

DELETE

The DELETE statement is used to delete existing records from a table.
o DELETE FROM table_name WHERE condition;

Examples:
o DELETE FROM invoices WHERE invoice_number = 'P400000';
o DELETE FROM invoices WHERE invoice_total - payment_total - credit_total = 0;

Aggregate functions

An aggregate function accumulates information from multiple tuples into a single-tuple summary.
Five basic (built-in) aggregate functions in SQL:
o COUNT() counts the number of rows that match a specified criterion;
o SUM() calculates the sum of the values of an expression over a set of rows;
o AVG() calculates the average value of an expression;
o MAX() returns the largest value of a selected column;
o MIN() returns the smallest value of a selected column.

COUNT()

COUNT() counts the number of rows that match a specified criterion.
o COUNT(expression)

Example:
o SELECT COUNT(invoice_number) FROM invoices WHERE invoice_total >= 300;

SUM()

SUM() calculates the sum of the values of an expression over a set of rows.
o SUM(expression)
The expression can use addition, subtraction, multiplication, division, or a combination of them.

Examples:
o SELECT SUM(invoice_total - payment_total) FROM invoices;
o SELECT SUM(Quantity * UnitPrice) FROM order_details;

AVG()

AVG() calculates the average value of an expression.
o AVG(expression)
The expression can use addition, subtraction, multiplication, division, or a combination of them.

Examples:
o SELECT AVG(invoice_total) AS AverageInvoice FROM invoices;
o SELECT AVG(Quantity * UnitPrice) AS Average FROM order_details;

MIN() and MAX()

MIN() returns the smallest value of a selected column.
o MIN(column_name)
MAX() returns the largest value of a selected column.
o MAX(column_name)

Example:
o SELECT MIN(invoice_total) AS lowest_invoice_total, MAX(invoice_total) AS highest_invoice_total, COUNT(*) AS number_of_invoices FROM invoices;

GROUP BY

The GROUP BY clause is used with the SELECT statement to arrange identical data into groups. It is often used with aggregate functions (COUNT(), MAX(), MIN(), SUM(), AVG()) to group the result set by one or more columns.
o SELECT column_name(s) FROM table_name WHERE condition GROUP BY column_name(s);

Examples:
o SELECT product, SUM(quantity) AS TotalSales FROM purchase WHERE price > 1 GROUP BY product;
o SELECT product, SUM(quantity) AS TotalSales FROM products WHERE price > 100 GROUP BY product;

DISTINCT

DISTINCT eliminates duplicate rows from a result set. It can be combined with the SELECT statement or with aggregate functions.

Examples:
o SELECT DISTINCT product FROM products; <- with DISTINCT
o SELECT product FROM products; <- without DISTINCT

Self join

A self join joins a table to itself: the same relation occurs twice in the FROM clause.
o SELECT column_name(s)
  FROM table1 T1, table1 T2
  WHERE condition;
o (Note: T1 and T2 are different table aliases for the same table.)

NOTE:
o It is a special case of a join.
o There is no SELF JOIN keyword.

Scenario when a self join is used
o Table Employee may have supervisor information columns (ID, name) associated with each employee. To query the data and get information for both people in one row, a self join can be used as follows.
o SELECT e1.EmployeeID,
         e1.FirstName,
         e1.LastName,
         e1.SupervisorID,
         e2.FirstName AS SupervisorFirstName,
         e2.LastName AS SupervisorLastName
  FROM Employee e1 JOIN Employee e2 ON e1.SupervisorID = e2.EmployeeID;

CROSS JOIN

CROSS JOIN returns every combination of rows from both tables: it pairs each row of the first table with each row of the second table, producing the Cartesian product.
o SELECT column_name(s) FROM table1 CROSS JOIN table2;
o SELECT i.invoice_id, i.invoice_total, i.invoice_number, v.vendor_name FROM invoices AS i CROSS JOIN vendors AS v;

When should CROSS JOIN be used?

CROSS JOIN vs INNER JOIN
Inner join
o SELECT T.t_id, T.prof_name, S.student_name FROM Teacher T INNER JOIN Student S ON T.t_id = S.prof_id;
Cross join
o SELECT T.t_id, T.prof_name, S.student_name FROM Teacher T CROSS JOIN Student S;

Aggregate and JOIN

An aggregate function in SQL performs a calculation on a set of values and returns a single value.
A JOIN clause combines rows from two or more tables, based on a related column between them.

Projections (SELECT * / SELECT c1, c2, ...)
A projection takes a vertical subset of the columns of a single table, retaining the unique rows.

Selections (aka filtering) (WHERE conditions)
A selection takes the horizontal subset of the rows of a single table that satisfy a particular condition.

DDL (Data Definition Language)
Write commands
o CREATE, ALTER, DROP, ...
Read commands
o DESCRIBE

DML (Data Manipulation Language)
Write commands
o INSERT, UPDATE, DELETE, ...
Read commands
o SELECT, SHOW

HAVING clause

The HAVING clause specifies a search condition for a group or an aggregate. It can be used only with the SELECT statement. HAVING was added to SQL because aggregate functions and aliases cannot be used in a WHERE clause.
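The point about WHERE and aggregates can be sketched with a small script. The cut-down invoices table and its three rows below are hypothetical sample data, invented only for illustration; they are not the course database.

```sql
-- Hypothetical, cut-down invoices table (assumption: only three columns are needed here).
CREATE TABLE invoices (
    invoice_id    INT PRIMARY KEY,
    vendor_id     INT,
    invoice_total DECIMAL(10,2)
);

INSERT INTO invoices VALUES (1, 4, 100.00), (2, 4, 250.00), (3, 7, 75.00);

-- Not valid: WHERE is evaluated row by row, before any groups exist,
-- so it cannot contain an aggregate such as COUNT(*).
-- SELECT vendor_id, COUNT(*) FROM invoices WHERE COUNT(*) > 1 GROUP BY vendor_id;

-- Valid: HAVING is evaluated once per group, after GROUP BY.
SELECT vendor_id, COUNT(*) AS invoice_count, SUM(invoice_total) AS total
FROM invoices
GROUP BY vendor_id
HAVING COUNT(*) > 1;
-- With the sample rows above, only vendor_id 4 qualifies (2 invoices, total 350.00).
```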
Example:
o Products(product, price, quantity)
o For each product in table products, find the total value, counting only rows whose price is greater than 1000 and keeping only products whose total sales quantity is greater than 40.
o SELECT product, SUM(price * quantity) FROM products WHERE price > 1000 GROUP BY product HAVING SUM(quantity) > 40;

WHERE vs HAVING

A WHERE condition is applied to individual rows
o The rows may or may not contribute to the aggregate
o No aggregate functions or aliases are allowed
o WHERE is used to select data in the original tables being processed.
A HAVING condition is applied to an entire group
o HAVING is used to filter data in the result set produced by the query. This means it can reference aggregate values and aliases from the SELECT clause.
o May use aggregate functions or aliases
Examples:
o See Week 3, Slide 2, page 16.

Subqueries

Introduction
A subquery is a query nested inside a SELECT, INSERT, UPDATE, or DELETE statement, or inside another subquery.
A subquery may occur in:
o A SELECT/INSERT/UPDATE/DELETE clause
o A FROM clause
o A WHERE clause

Subqueries in SELECT
Product(pname, price, cid)
Company(cid, cname, city)
Problem: For each product, find the city where it is manufactured.
Whenever possible, avoid nested queries.
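One possible way to write the product/city problem is sketched below, first with a subquery in the SELECT clause and then as the equivalent join. The column types and sample rows are assumptions made up for illustration; only the table and column names come from the schemas above.

```sql
-- Hypothetical sample data; column types are assumed.
CREATE TABLE Company (cid INT PRIMARY KEY, cname VARCHAR(50), city VARCHAR(50));
CREATE TABLE Product (pname VARCHAR(50), price DECIMAL(10,2), cid INT);

INSERT INTO Company VALUES (1, 'GizmoWorks', 'Waterloo'), (2, 'Canon', 'Kyoto');
INSERT INTO Product VALUES ('Gizmo', 19.99, 1), ('Camera', 149.99, 2), ('Lens', 89.50, 2);

-- Version 1: correlated subquery in the SELECT clause.
-- The inner query runs once per product row and must return at most one value,
-- which holds here because cid is the key of Company.
SELECT p.pname,
       (SELECT c.city FROM Company c WHERE c.cid = p.cid) AS city
FROM Product p;

-- Version 2: the same question written as a join (no nesting).
SELECT p.pname, c.city
FROM Product p
JOIN Company c ON c.cid = p.cid;
```

With this sample data both versions return one row per product. They differ only when a product has no matching company: the subquery version returns that product with a NULL city, while the inner join drops it.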