Structured Query Language
Lecture 27
STAT 598 W

Outline
  Introduction to SQL & MySQL
  Single-table queries
    – Using computed columns
    – Using special operators: LIKE, IN, BETWEEN
    – Columns with NULL values
    – Sorting data
    – Using group functions
    – The GROUP BY clause

Structured Query Language (SQL)
  Mid-1970s:
    – SQL was developed at IBM under the name SEQUEL
  1980:
    – Renamed as SQL to avoid confusion with an unrelated hardware product called SEQUEL
  Most relational DBMSes use some version of SQL

SQL (cont.)
  Is an English-like language
  Communicates with an SQL Server
  Manipulates data and table definitions in the database
  Supports operations of Relational Algebra

SQL Statements
  Data Retrieval: SELECT
  Data Definition Language (DDL): CREATE, ALTER, DROP, RENAME, TRUNCATE
  Data Manipulation Language (DML): INSERT, UPDATE, DELETE
  Data Control Language (DCL): GRANT, REVOKE
  Transaction control: COMMIT, ROLLBACK, SAVEPOINT

SQL Statements (cont.)
  SQL statements are free format
  SQL statements can be placed on one or more lines
  Statements are entered in the SQL buffer
  Keywords cannot be abbreviated or split across lines
  Clauses are usually placed one per line
  Indentation is used to improve readability
  Functions are used to manipulate data and to format the output of a query
  The end of a statement is indicated by a semicolon

MySQL Database Universe
  MySQL Clients and Tools
    – GUI Tools: MySQL Administrator, MySQL Workbench
    – Command Line Tools: mysqladmin, mysql, mysqldump
  MySQL Server: mysqld
  MySQL APIs: Connector/J, Connector/PHP, Connector/ODBC, Connector/C, Connector/C++, Connector/Net

MySQL Command Line Tools
  [Diagram: the clients mysql, mysqladmin, and mysqldump send requests such as SELECT, UPDATE, BACKUP, RESTORE, CHECK, and SHUTDOWN to the mysqld server process, which manages the DB]

Essential mysql Commands
  mysql> SHOW databases;            to show available databases
  mysql> CREATE DATABASE premiere;  to create a new database
  mysql> USE premiere;              to start using the Premiere database
  mysql> SHOW tables;               to show available tables in the default db
  mysql> SOURCE c:\premiere.txt     to run a script file
  mysql> DESCRIBE customer;         to show the structure of the customer table
  mysql> EXIT                       to exit the mysql client

Help in MySQL
  Type “help” at the mysql> prompt, or
  Type “help” followed by the name of a statement, e.g.:
    – help select
    – help union
  Also available:
    – Reference Manual: on-line or pdf version

mysql> help

For information about MySQL products and services, visit:
   http://www.mysql.com/
For developer information, including the MySQL Reference Manual, visit:
   http://dev.mysql.com/
To buy MySQL Enterprise support, training, or other products, visit:
   https://shop.mysql.com/

List of all MySQL commands:
Note that all text commands must be first on line and end with ';'
?         (\?) Synonym for 'help'.
clear     (\c) Clear the current input statement.
connect   (\r) Reconnect to the server. Optional arguments are db and host.
delimiter (\d) Set statement delimiter.
ego       (\G) Send command to mysql server, display result vertically.
exit      (\q) Exit mysql. Same as quit.
go        (\g) Send command to mysql server.
help      (\h) Display this help.
notee     (\t) Don't write into outfile.
print     (\p) Print current command.
prompt    (\R) Change your mysql prompt.
quit      (\q) Quit mysql.
rehash    (\#) Rebuild completion hash.
source    (\.) Execute an SQL script file. Takes a file name as an argument.
status    (\s) Get status information from the server.
tee       (\T) Set outfile [to_outfile]. Append everything into given outfile.
use       (\u) Use another database. Takes database name as argument.
charset   (\C) Switch to another charset. Might be needed for processing binlog with multi-byte charsets.
warnings  (\W) Show warnings after every statement.
nowarning (\w) Don't show warnings after every statement.

SQL Editor
  An SQL server is installed on many computers, but in some cases only an administrator has full access to it.
  To practice your SQL editing skills, you can download an “SQL Editor” online for free.
  To make it even simpler, you can use an online SQL editor directly, such as “SQL Fiddle”.

Create a simple table

CREATE TABLE Contacts (
  id int auto_increment primary key,
  type varchar(20),
  details varchar(80)
);

INSERT INTO Contacts (type, details)
VALUES
  ('Email', 'wang913@purdue.edu'),
  ('Website', 'www.stat.purdue.edu/~wang913'),
  ('Address', 'Purdue University'),
  ('Phone', '765-714-4263');

+----+---------+------------------------------+
| ID | TYPE    | DETAILS                      |
+----+---------+------------------------------+
|  1 | Email   | wang913@purdue.edu           |
|  2 | Website | www.stat.purdue.edu/~wang913 |
|  3 | Address | Purdue University            |
|  4 | Phone   | 765-714-4263                 |
+----+---------+------------------------------+

Insert from a data file
  Use “BULK INSERT” (this is Microsoft SQL Server syntax; the corresponding MySQL statement is LOAD DATA INFILE):

BULK INSERT MyTable
FROM 'c:\data.csv'
WITH (
  FIELDTERMINATOR = ',',
  ROWTERMINATOR = '\n'
)

Premiere Products Model
  [ER diagram: REP represents CUSTOMER; CUSTOMER places ORDER; ORDER has ORDER_LINE; PART is included in ORDER_LINE. Cardinality labels in the diagram: (1, 1), (1, N), (0, N)]

  REP (Rep_num, Last_Name, First_Name, Street, City, State, Zip, Commission, Rate)
  CUSTOMER (Customer_num, Customer_Name, Street, City, State, Zip, Balance, Credit_limit, Rep_num*)
  ORDER (Order_num, Order_date, Customer_num*)
  ORDER_LINE (Order_num*, Part_num*, Num_ordered, Quoted_price)
  PART (Part_num, Description, Warehouse, Class, Price, On_hand)

Source: “A Guide to MySQL” by Philip J. Pratt and Mary Z.
Last, Course Technology, 2006

Existing Tables in a Default DB
  To find out what tables exist in the default database, use the SHOW command:

mysql> show tables;
+--------------------+
| Tables_in_premiere |
+--------------------+
| customer           |
| order_line         |
| orders             |
| part               |
| rep                |
+--------------------+
5 rows in set (0.00 sec)

Displaying a Table Structure
  The DESCRIBE command:

mysql> desc customer;
+---------------+--------------+------+-----+---------+-------+
| Field         | Type         | Null | Key | Default | Extra |
+---------------+--------------+------+-----+---------+-------+
| CUSTOMER_NUM  | char(3)      | NO   | PRI | NULL    |       |
| CUSTOMER_NAME | char(35)     | NO   |     | NULL    |       |
| STREET        | char(15)     | YES  |     | NULL    |       |
| CITY          | char(15)     | YES  |     | NULL    |       |
| STATE         | char(2)      | YES  |     | NULL    |       |
| ZIP           | char(5)      | YES  |     | NULL    |       |
| BALANCE       | decimal(8,2) | YES  |     | NULL    |       |
| CREDIT_LIMIT  | decimal(8,2) | YES  |     | NULL    |       |
| REP_NUM       | char(2)      | YES  |     | NULL    |       |
+---------------+--------------+------+-----+---------+-------+
9 rows in set (0.01 sec)

SELECT Statement

SELECT column(s)
FROM table(s)
WHERE row condition
GROUP BY column(s)
HAVING group condition
ORDER BY column(s)
LIMIT m, n;

WHERE Clause
  Find the number, name, balance, and credit limit for each customer with balance that exceeds the credit limit.

mysql> SELECT customer_num,customer_name, balance, credit_limit
    -> FROM customer
    -> WHERE balance > credit_limit;
+--------------+---------------------+---------+--------------+
| customer_num | customer_name       | balance | credit_limit |
+--------------+---------------------+---------+--------------+
| 408          | The Everything Shop | 5285.25 |      5000.00 |
| 842          | All Season          | 8221.00 |      7500.00 |
+--------------+---------------------+---------+--------------+
2 rows in set (0.02 sec)

Compound Condition
  List the description of every part that is not in warehouse number 3 and that has more than 20 units on hand.
mysql> SELECT description
    -> FROM part
    -> WHERE warehouse <> '3'
    -> AND on_hand > 20;
+----------------+
| description    |
+----------------+
| Home Gym       |
| Microwave Oven |
+----------------+
2 rows in set (0.05 sec)

Expressions
  Find the number, name, and available credit for each customer with at least $5,000 of available credit.

mysql> SELECT customer_num, customer_name,
    -> (credit_limit - balance) as "Available Credit"
    -> FROM customer
    -> WHERE (credit_limit - balance) >= 5000;
+--------------+----------------------------+------------------+
| customer_num | customer_name              | Available Credit |
+--------------+----------------------------+------------------+
| 282          | Brookings Direct           |          9568.50 |
| 462          | Bargains Galore            |          6588.00 |
| 608          | Johnson's Department Store |          7894.00 |
| 725          | Deerfield's Four Seasons   |          7252.00 |
+--------------+----------------------------+------------------+
4 rows in set (0.00 sec)

BETWEEN operator
  The BETWEEN operator makes certain SELECT statements simpler
  List customer number, name, and balance for customers whose balance is between $2,000 and $5,000.

mysql> SELECT customer_num, customer_name, balance
    -> FROM customer
    -> WHERE balance BETWEEN 2000 AND 5000;
+--------------+----------------------------+---------+
| customer_num | customer_name              | balance |
+--------------+----------------------------+---------+
| 462          | Bargains Galore            | 3412.00 |
| 608          | Johnson's Department Store | 2106.00 |
| 687          | Lee's Sport and Appliance  | 2851.00 |
+--------------+----------------------------+---------+
3 rows in set (0.00 sec)

LIKE operator
  The LIKE operator is used when an exact character match is not applicable
  LIKE is used with wildcard searches
    – % (percent) matches any string of zero or more characters
    – _ (underscore) matches any single character
  The ESCAPE option can be used to define the escape character

LIKE operator (cont.)
  List customer number, name, and complete address of each customer with a street name that contains “Central”.
mysql> SELECT customer_num, customer_name, street, city, state, zip
    -> FROM customer
    -> WHERE street LIKE '%Central%';
+--------------+-----------------+--------------+-------+-------+-------+
| customer_num | customer_name   | street       | city  | state | zip   |
+--------------+-----------------+--------------+-------+-------+-------+
| 462          | Bargains Galore | 3829 Central | Grove | FL    | 33321 |
+--------------+-----------------+--------------+-------+-------+-------+
1 row in set (0.00 sec)

LIKE operator (cont.)
  You have difficulty reading a report because someone spilled coffee on it. You can only tell the first digit (‘4’) and the last digit (‘8’) of the customer number. The second digit is hard to read. Can you find the customer name and complete address?

mysql> SELECT customer_num, customer_name, street, city, state, zip
    -> FROM customer
    -> WHERE customer_num LIKE '4_8';
+--------------+---------------------+------------+---------+-------+-------+
| customer_num | customer_name       | street     | city    | state | zip   |
+--------------+---------------------+------------+---------+-------+-------+
| 408          | The Everything Shop | 1828 Raven | Crystal | FL    | 33503 |
+--------------+---------------------+------------+---------+-------+-------+
1 row in set (0.03 sec)

IN operator
  The IN operator provides a concise way to test for values in a specified set.
  List the customer number, name, and credit limit for each customer with a credit limit of $5,000, $10,000, or $15,000.
mysql> SELECT customer_num, customer_name, credit_limit
    -> FROM customer
    -> WHERE credit_limit IN (5000, 10000, 15000);
+--------------+----------------------------+--------------+
| customer_num | customer_name              | credit_limit |
+--------------+----------------------------+--------------+
| 282          | Brookings Direct           |     10000.00 |
| 408          | The Everything Shop        |      5000.00 |
| 462          | Bargains Galore            |     10000.00 |
| 524          | Kline's                    |     15000.00 |
| 608          | Johnson's Department Store |     10000.00 |
| 687          | Lee's Sport and Appliance  |      5000.00 |
+--------------+----------------------------+--------------+
6 rows in set (0.00 sec)

Null Values
  Occasionally, when you enter a new row into a table or modify an existing row, the values for one or more columns are unknown or unavailable
    – e.g., a sales representative is not assigned to a customer
  This special value is called a null data value, or null.
  A null is not the same as zero or a blank space.

Three Valued Logic
  Any comparison with null returns the value unknown
    – e.g., 15 > null, null = null, column < null, column = null
  The result of a WHERE clause predicate is treated as false if it evaluates to unknown

Three Valued Logic (cont.)

+---------+---------+---------+---------+
| AND     | TRUE    | FALSE   | UNKNOWN |
+---------+---------+---------+---------+
| TRUE    | TRUE    | FALSE   | UNKNOWN |
| FALSE   | FALSE   | FALSE   | FALSE   |
| UNKNOWN | UNKNOWN | FALSE   | UNKNOWN |
+---------+---------+---------+---------+

+---------+---------+---------+---------+
| OR      | TRUE    | FALSE   | UNKNOWN |
+---------+---------+---------+---------+
| TRUE    | TRUE    | TRUE    | TRUE    |
| FALSE   | TRUE    | FALSE   | UNKNOWN |
| UNKNOWN | TRUE    | UNKNOWN | UNKNOWN |
+---------+---------+---------+---------+

  NOT (unknown) evaluates to unknown

Selecting rows with NULL values
  Do we have a complete address for each customer?
  List the number and name of each customer with unknown/missing street information.

mysql> SELECT customer_num, customer_name
    -> FROM customer
    -> WHERE street IS NULL;
Empty set (0.00 sec)

Rules of Precedence
  The rules determine the order in which expressions are evaluated
  The default order:
    1. Parentheses
    2. Arithmetic operators
    3. Comparison conditions, IS, LIKE, IN
    4. BETWEEN, CASE
    5. NOT logical condition
    6. AND logical condition
    7. OR logical condition
  This order can be modified by using parentheses

Sorting
  Typically rows are displayed in the order in which they were inserted
  The ORDER BY clause can be used to list data in a desired order
  The column(s) on which data is sorted are called the sort key(s)
    – The sort keys are listed in the order of importance
  To sort in descending order use the DESC keyword (default is ASC)

Sorting
  List the customer number, name, and balance of each customer. Order the output in ascending (increasing) order of balance.

mysql> SELECT customer_num, customer_name, balance
    -> FROM customer
    -> ORDER BY balance
    -> LIMIT 5;
+--------------+----------------------------+----------+
| customer_num | customer_name              | balance  |
+--------------+----------------------------+----------+
| 725          | Deerfield's Four Seasons   |   248.00 |
| 282          | Brookings Direct           |   431.50 |
| 608          | Johnson's Department Store |  2106.00 |
| 687          | Lee's Sport and Appliance  |  2851.00 |
| 462          | Bargains Galore            |  3412.00 |
+--------------+----------------------------+----------+
5 rows in set (0.00 sec)

Sorting with multiple keys
  List the customer number, name, and credit limit of every customer, ordered by credit limit in descending order and by name within credit limit.
mysql> SELECT customer_num, customer_name cname, credit_limit
    -> FROM customer
    -> ORDER BY credit_limit DESC, cname
    -> LIMIT 5;
+--------------+----------------------------+--------------+
| customer_num | cname                      | credit_limit |
+--------------+----------------------------+--------------+
| 524          | Kline's                    |     15000.00 |
| 462          | Bargains Galore            |     10000.00 |
| 282          | Brookings Direct           |     10000.00 |
| 608          | Johnson's Department Store |     10000.00 |
| 148          | Al's Appliance and Sport   |      7500.00 |
+--------------+----------------------------+--------------+
5 rows in set (0.00 sec)

Group Functions
  SUM – Sum of values in a column
  AVG – Average value in a column
  COUNT – Number of values in a column
  MAX – Maximum value in a column
  MIN – Minimum value in a column
  STDDEV – Standard Deviation of values in a column
  VARIANCE – Variance of values in a column

Group Functions (cont.)
  They operate on a set of values as input and give one value as a result
  COUNT, MAX and MIN functions can be used with any data type
  SUM, AVG, STDDEV, and VARIANCE can be used only with numeric data types
  All group functions ignore null values except COUNT(*)

Counting rows in a table
  How many parts are in item class HW?

mysql> SELECT COUNT(*)
    -> FROM part
    -> WHERE class = 'HW';
+----------+
| COUNT(*) |
+----------+
|        3 |
+----------+
1 row in set (0.00 sec)

SUM function
  Find the total number of customers and the total of their balances.

mysql> SELECT COUNT(*) "Number of Customers",
    -> SUM(balance) "Total Balance"
    -> FROM customer;
+---------------------+---------------+
| Number of Customers | Total Balance |
+---------------------+---------------+
|                  10 |      47651.75 |
+---------------------+---------------+
1 row in set (0.00 sec)

Summary statistics
  Provide summary statistics of customer balance.
mysql> SELECT COUNT(balance) N, AVG(balance) Xbar,
    -> MIN(balance) Min, MAX(balance) Max,
    -> STD(balance) S
    -> FROM customer;
+----+-------------+--------+----------+-------------+
| N  | Xbar        | Min    | Max      | S           |
+----+-------------+--------+----------+-------------+
| 10 | 4765.175000 | 248.00 | 12762.00 | 3635.106972 |
+----+-------------+--------+----------+-------------+
1 row in set (0.00 sec)

MIN function with character type
  Alphabetically, what are the first and the last part descriptions in the PART table?

mysql> SELECT MIN(description) First,
    -> MAX(description) Last
    -> FROM part;
+----------------+--------+
| First          | Last   |
+----------------+--------+
| Cordless Drill | Washer |
+----------------+--------+
1 row in set (0.00 sec)

DISTINCT operator
  To avoid duplicates, either when listing or counting values, precede the column name with the DISTINCT operator
  The DISTINCT operator is not a function
  Useful when used within the COUNT function

Results with repeated rows
  Find the customer number of each customer that currently has an open order (i.e., an order in the ORDERS table).

mysql> SELECT customer_num
    -> FROM orders;
+--------------+
| customer_num |
+--------------+
| 148          |
| 356          |
| 408          |
| 282          |
| 608          |
| 148          |
| 608          |
+--------------+
7 rows in set (0.03 sec)

Results without repeated rows
  Find the customer number of each customer that currently has an open order. List each customer only once.

mysql> SELECT DISTINCT customer_num
    -> FROM orders;
+--------------+
| customer_num |
+--------------+
| 148          |
| 356          |
| 408          |
| 282          |
| 608          |
+--------------+
5 rows in set (0.00 sec)

DISTINCT used with COUNT
  Count the number of customers who currently have open orders.
mysql> SELECT COUNT(customer_num)
    -> FROM orders;
+---------------------+
| COUNT(customer_num) |
+---------------------+
|                   7 |
+---------------------+
1 row in set (0.00 sec)

mysql> SELECT COUNT(DISTINCT customer_num)
    -> FROM orders;
+------------------------------+
| COUNT(DISTINCT customer_num) |
+------------------------------+
|                            5 |
+------------------------------+
1 row in set (0.00 sec)

Describing Groups of Data

SELECT column(s), ... group_function(column)
FROM table(s)
WHERE row condition
GROUP BY column(s)
HAVING group condition
ORDER BY column(s)
LIMIT m, n;

Using the GROUP BY clause
  GROUP BY clause allows rows that share some common characteristics to be grouped
  Multiple columns and expressions can be used for grouping
  Specified group functions are performed on each group
  Columns in the GROUP BY clause do not have to be in the SELECT list

Grouping Data
  List class ID and the average unit price of products in each class.

mysql> SELECT class, AVG(price)
    -> FROM part
    -> GROUP BY class;
+-------+-------------+
| class | AVG(price)  |
+-------+-------------+
| AP    |  400.988000 |
| HW    |  104.950000 |
| SG    | 1092.475000 |
+-------+-------------+
3 rows in set (0.00 sec)

How does it work?
  Original PART table:

+----------+----------------+---------+-------+-----------+---------+
| Part_Num | Description    | On_hand | Class | Warehouse | Price   |
+----------+----------------+---------+-------+-----------+---------+
| AT94     | Iron           |      50 | HW    | 3         |   24.95 |
| BV06     | Home Gym       |      45 | SG    | 2         |  794.95 |
| CD52     | Microwave Oven |      32 | AP    | 1         |  165.00 |
| DL71     | Cordless Drill |      21 | HW    | 3         |  129.95 |
| DR93     | Gas Range      |       8 | AP    | 2         |  495.00 |
| DW11     | Washer         |      12 | AP    | 3         |  399.99 |
| FD21     | Stand Mixer    |      22 | HW    | 3         |  159.95 |
| KL62     | Dryer          |      12 | AP    | 1         |  349.95 |
| KT03     | Dishwasher     |       8 | AP    | 3         |  595.00 |
| KV29     | Treadmill      |       9 | SG    | 2         | 1390.00 |
+----------+----------------+---------+-------+-----------+---------+

How does it work?
  PART table sorted by “class”:

+----------+----------------+---------+-------+-----------+---------+
| Part_Num | Description    | On_hand | Class | Warehouse | Price   |
+----------+----------------+---------+-------+-----------+---------+
| CD52     | Microwave Oven |      32 | AP    | 1         |  165.00 |
| DR93     | Gas Range      |       8 | AP    | 2         |  495.00 |
| DW11     | Washer         |      12 | AP    | 3         |  399.99 |
| KL62     | Dryer          |      12 | AP    | 1         |  349.95 |
| KT03     | Dishwasher     |       8 | AP    | 3         |  595.00 |
| AT94     | Iron           |      50 | HW    | 3         |   24.95 |
| DL71     | Cordless Drill |      21 | HW    | 3         |  129.95 |
| FD21     | Stand Mixer    |      22 | HW    | 3         |  159.95 |
| BV06     | Home Gym       |      45 | SG    | 2         |  794.95 |
| KV29     | Treadmill      |       9 | SG    | 2         | 1390.00 |
+----------+----------------+---------+-------+-----------+---------+

  We have 5 rows in AP class, 3 rows in HW class, 2 rows in SG class
  Average price per class: AP = 400.99, HW = 104.95, SG = 1092.48

Do use group functions with GROUP BY
  List class and average unit price in each class.
  Without a group function, the price shown for each class is just the value from one row of the group (here the first row of each class in the sorted table above), not a summary of the group:

mysql> SELECT class, price
    -> FROM part
    -> GROUP BY class;
+-------+--------+
| class | price  |
+-------+--------+
| AP    | 165.00 |
| HW    |  24.95 |
| SG    | 794.95 |
+-------+--------+
3 rows in set (0.00 sec)

Grouping with GROUP_CONCAT()
  The GROUP_CONCAT() function returns a string result with the concatenated values from a group

mysql> SELECT class,
    -> GROUP_CONCAT(DISTINCT description) List
    -> FROM part
    -> GROUP BY class;
+-------+--------------------------------------------------+
| class | List                                             |
+-------+--------------------------------------------------+
| AP    | Washer,Dishwasher,Microwave Oven,Dryer,Gas Range |
| HW    | Stand Mixer,Iron,Cordless Drill                  |
| SG    | Home Gym,Treadmill                               |
+-------+--------------------------------------------------+
3 rows in set (0.00 sec)

Grouping with GROUP_CONCAT()

mysql> SELECT class,
    -> GROUP_CONCAT(DISTINCT description) List
    -> FROM part
    -> GROUP BY class \G
*************************** 1. row ***************************
class: AP
 List: Washer,Dishwasher,Microwave Oven,Dryer,Gas Range
*************************** 2. row ***************************
class: HW
 List: Stand Mixer,Iron,Cordless Drill
*************************** 3. row ***************************
class: SG
 List: Home Gym,Treadmill
3 rows in set (0.00 sec)

Using WITH ROLLUP clause
  For each warehouse and class, provide the average price of parts. Also provide the average price in each warehouse.
mysql> SELECT warehouse, class, AVG(price)
    -> FROM part
    -> GROUP BY warehouse, class
    -> WITH ROLLUP;
+-----------+-------+-------------+
| warehouse | class | AVG(price)  |
+-----------+-------+-------------+
| 1         | AP    |  257.475000 |
| 1         | NULL  |  257.475000 |
| 2         | AP    |  495.000000 |
| 2         | SG    | 1092.475000 |
| 2         | NULL  |  893.316667 |
| 3         | AP    |  497.495000 |
| 3         | HW    |  104.950000 |
| 3         | NULL  |  261.968000 |
| NULL      | NULL  |  450.474000 |
+-----------+-------+-------------+
9 rows in set (0.00 sec)

Counting the rows in a group
  List each credit limit and the number of customers having each credit limit.

mysql> SELECT credit_limit, COUNT(*)
    -> FROM customer
    -> GROUP BY credit_limit;
+--------------+----------+
| credit_limit | COUNT(*) |
+--------------+----------+
|      5000.00 |        2 |
|      7500.00 |        4 |
|     10000.00 |        3 |
|     15000.00 |        1 |
+--------------+----------+
4 rows in set (0.00 sec)

Using a HAVING clause
  List the order number and the total value for orders over $1,000.

mysql> SELECT order_num,
    -> SUM(num_ordered*quoted_price) total
    -> FROM order_line
    -> GROUP BY order_num
    -> HAVING SUM(num_ordered*quoted_price) > 1000;
+-----------+---------+
| order_num | total   |
+-----------+---------+
| 21613     | 1319.80 |
| 21614     | 1190.00 |
| 21617     | 2189.90 |
| 21623     | 2580.00 |
+-----------+---------+
4 rows in set (0.00 sec)

Displaying specific groups
  List each credit limit held by more than one customer, together with the number of customers having that credit limit.

mysql> SELECT credit_limit, COUNT(*)
    -> FROM customer
    -> GROUP BY credit_limit
    -> HAVING COUNT(*) > 1;
+--------------+----------+
| credit_limit | COUNT(*) |
+--------------+----------+
|      5000.00 |        2 |
|      7500.00 |        4 |
|     10000.00 |        3 |
+--------------+----------+
3 rows in set (0.00 sec)

HAVING vs. WHERE
  The WHERE clause limits/restricts individual rows
  The HAVING clause limits/restricts output to certain groups on the basis of aggregate information

Restricting the rows to be grouped
  List each credit limit and the number of customers of sales rep 20 that have this credit limit.

mysql> SELECT credit_limit, COUNT(*)
    -> FROM customer
    -> WHERE rep_num = '20'
    -> GROUP BY credit_limit;
+--------------+----------+
| credit_limit | COUNT(*) |
+--------------+----------+
|      7500.00 |        2 |
|     15000.00 |        1 |
+--------------+----------+
2 rows in set (0.00 sec)

Restricting the rows and the groups
  Repeat the previous example, but list only those credit limits held by more than one customer.

mysql> SELECT CREDIT_LIMIT, COUNT(*)
    -> FROM CUSTOMER
    -> WHERE REP_NUM = '20'
    -> GROUP BY CREDIT_LIMIT
    -> HAVING COUNT(*) > 1;
+--------------+----------+
| CREDIT_LIMIT | COUNT(*) |
+--------------+----------+
|      7500.00 |        2 |
+--------------+----------+
1 row in set (0.00 sec)
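
Trying WHERE and HAVING together without a MySQL server
  The WHERE-then-HAVING order can be tried without installing MySQL. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module in place of MySQL, loads the ten PART rows shown earlier in the lecture into an in-memory table, and uses hypothetical helper names (build_part_table, classes_with_stock) that are not part of the lecture. The SQL itself is plain SELECT ... WHERE ... GROUP BY ... HAVING, so it should behave the same way in the mysql client.

```python
import sqlite3

# PART rows copied from the lecture's PART table:
# (part_num, description, on_hand, class, warehouse, price)
PART_ROWS = [
    ("AT94", "Iron", 50, "HW", "3", 24.95),
    ("BV06", "Home Gym", 45, "SG", "2", 794.95),
    ("CD52", "Microwave Oven", 32, "AP", "1", 165.00),
    ("DL71", "Cordless Drill", 21, "HW", "3", 129.95),
    ("DR93", "Gas Range", 8, "AP", "2", 495.00),
    ("DW11", "Washer", 12, "AP", "3", 399.99),
    ("FD21", "Stand Mixer", 22, "HW", "3", 159.95),
    ("KL62", "Dryer", 12, "AP", "1", 349.95),
    ("KT03", "Dishwasher", 8, "AP", "3", 595.00),
    ("KV29", "Treadmill", 9, "SG", "2", 1390.00),
]


def build_part_table():
    """Create an in-memory SQLite database holding the PART table (assumption: SQLite stands in for MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE part ("
        " part_num TEXT PRIMARY KEY, description TEXT, on_hand INTEGER,"
        " class TEXT, warehouse TEXT, price REAL)"
    )
    conn.executemany("INSERT INTO part VALUES (?, ?, ?, ?, ?, ?)", PART_ROWS)
    return conn


def classes_with_stock(conn, min_on_hand, min_parts):
    """Count parts per class: WHERE filters rows first, HAVING then filters the groups."""
    query = (
        "SELECT class, COUNT(*) FROM part"
        " WHERE on_hand > ?"      # row condition, applied before grouping
        " GROUP BY class"
        " HAVING COUNT(*) >= ?"   # group condition, applied after grouping
        " ORDER BY class"
    )
    return conn.execute(query, (min_on_hand, min_parts)).fetchall()


conn = build_part_table()

# No restrictions: every class with its row count.
all_groups = conn.execute(
    "SELECT class, COUNT(*) FROM part GROUP BY class ORDER BY class"
).fetchall()

# Keep only parts with more than 10 units on hand, then only classes
# that still contain at least 2 such parts.
filtered = classes_with_stock(conn, 10, 2)

print(all_groups)  # [('AP', 5), ('HW', 3), ('SG', 2)]
print(filtered)    # [('AP', 3), ('HW', 3)]
```

  In this run the WHERE condition removes Gas Range, Dishwasher, and Treadmill before grouping, so AP drops from 5 rows to 3 and SG from 2 rows to 1; the HAVING condition then discards the SG group because it no longer has at least 2 rows.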