Database technology
Lecture 2: Relational databases and SQL
Jose M. Peña
jose.m.pena@liu.se

Database design process

Relational model concepts

The relation EMPLOYEE: the columns are the attributes and the rows are the tuples.

FNAME   M  LNAME    SSN        BDATE       ADDRESS  S  SALARY  SUPERSSN   DNO
Ramesh  K  Narayan  666884444  1962-09-15  …        M  38000   888665555  5
Joyce   A  English  453453453  1972-07-31  …        F  25000   888665555  5
Ahmad   V  Jabbar   987987987  1969-03-29  …        M  25000   888665555  4
James   E  Borg     888665555  1937-11-10  …        M  55000   NULL       1

Relation: Set of tuples, i.e. no duplicates are allowed.
Database: Collection of relations.
Relation schema: EMPLOYEE (FNAME, M, LNAME, SSN, BDATE, ADDRESS, S, SALARY, SUPERSSN, DNO)

Relational model concepts

Domain: The set of values that an attribute may take, e.g. string shorter than 30 chars, date of the form yyyy-mm-dd, integer with 400 < x < 8000, character M or F.
NULL value: Stands in for a missing value, as in the M and SUPERSSN columns below.

FNAME   M     LNAME    SSN        BDATE       ADDRESS  S  SALARY  SUPERSSN   DNO
Ramesh  K     Narayan  666884444  1962-09-15  …        M  38000   888665555  5
Joyce   NULL  English  453453453  1972-07-31  …        F  38000   888665555  5
Ahmad   V     Jabbar   987987987  1969-03-29  …        M  25000   888665555  4
James   NULL  Borg     888665555  1937-11-10  …        M  55000   NULL       1

Relational model constraints

Entity integrity constraint (illustrated on the same EMPLOYEE table as on the previous slide).

Relational model constraints

Foreign keys and the referential integrity constraint, illustrated on EMPLOYEE (the table from the first "Relational model concepts" slide) and DEPARTMENT:

DNAME           DNUMBER  MGRSSN     MGRSTARTDATE
Research        5        666884444  1988-05-22
Administration  4        987987987  1995-01-01
Headquarters    1        888665555  1981-06-19

Relational model constraints

(Atomic) domain (or NULL).
Key.
Entity integrity: A primary key (PK) cannot take NULL values.
Referential integrity: A foreign key (FK) in a relation can only refer to the PK of another (or the same) relation, the domains of the FK and the PK must coincide, and the FK takes either the NULL value or a value that exists for the PK.

SQL

Relational data model  SQL
relation               table
attribute              column
tuple                  row

Used in many DBMSs.
Declarative (what data to get, not how).
DDL (Data Definition Language): CREATE, ALTER, DROP
Queries: SELECT
DML (Data Manipulation Language): INSERT, DELETE, UPDATE
…

COMPANY schema

EMPLOYEE (FNAME, MINIT, LNAME, SSN, BDATE, ADDRESS, SEX, SALARY, SUPERSSN, DNO)
DEPT_LOCATIONS (DNUMBER, DLOCATION)
DEPARTMENT (DNAME, DNUMBER, MGRSSN, MGRSTARTDATE)
WORKS_ON (ESSN, PNO, HOURS)
PROJECT (PNAME, PNUMBER, PLOCATION, DNUM)
DEPENDENT (ESSN, DEPENDENT_NAME, SEX, BDATE, RELATIONSHIP)

Create tables

    CREATE TABLE <tablename> (
        <colname> <datatype> [<constraint>],
        …,
        [<constraint>],
        …
    );

Data types: Integer, decimal, number, varchar, char, etc.
Constraints: Not null, primary key, foreign key, unique, etc.

Create tables

    CREATE TABLE WORKS_ON (
        ESSN INTEGER,
        PNO INTEGER,
        HOURS DECIMAL(3,1),
        CONSTRAINT pk_workson PRIMARY KEY (ESSN, PNO),
        CONSTRAINT fk_works_emp FOREIGN KEY (ESSN) REFERENCES EMPLOYEE(SSN),
        CONSTRAINT fk_works_proj FOREIGN KEY (PNO) REFERENCES PROJECT(PNUMBER)
    );

Modify tables

Change the definition of a table: Add, delete and modify columns and constraints.

    ALTER TABLE EMPLOYEE ADD JOB VARCHAR(12);
    ALTER TABLE EMPLOYEE DROP COLUMN ADDRESS CASCADE;
    ALTER TABLE WORKS_ON DROP FOREIGN KEY fk_works_emp;
    ALTER TABLE WORKS_ON ADD CONSTRAINT fk_works_emp FOREIGN KEY (ESSN) REFERENCES EMPLOYEE(SSN);

Delete a table and its definition:

    DROP TABLE EMPLOYEE;

Query tables

    SELECT <attribute-list>
    FROM <table-list>
    WHERE <condition>;

Attribute list: A1, …, Ar. Attributes whose values are required.
Table list: R1, …, Rk. Relations to be queried.
Condition: Boolean expression. It identifies the tuples that should be retrieved.
It may include comparison operators (=, <>, >, >=, etc.) and/or logical operators (and, or, not).

Simple query

List the SSN for all employees.

    SELECT SSN FROM EMPLOYEE;

SSN
123456789
333445555
999887777
987654321
666884444
453453453
987987987
888665555

Use of *

List all information about the employees of department 5.

    SELECT FNAME, MINIT, LNAME, SSN, BDATE, ADDRESS, SEX, SALARY, SUPERSSN, DNO
    FROM EMPLOYEE
    WHERE DNO = 5;

or

    SELECT * FROM EMPLOYEE WHERE DNO = 5;

Comparison operators: {=, <>, >, >=, etc.}

Simple query

List the last name, birth date and address for all employees whose name is 'Alicia J. Zelaya'.

    SELECT LNAME, BDATE, ADDRESS
    FROM EMPLOYEE
    WHERE FNAME = 'Alicia' AND MINIT = 'J' AND LNAME = 'Zelaya';

Logical operators: {and, or, not}

LNAME   BDATE       ADDRESS
Zelaya  1968-07-19  3321 Castle, Spring, TX

Pattern matching

List the birth date and address for all employees whose last name contains the substring 'aya'.

    SELECT BDATE, ADDRESS FROM EMPLOYEE WHERE LNAME LIKE '%aya%';

LIKE comparison operator:
% replaces 0 or more characters
_ replaces a single character

LNAME    BDATE       ADDRESS
Zelaya   1968-07-19  3321 Castle, Spring, TX
Narayan  1962-09-15  975 Fire Oak, Humble, TX

(LNAME is shown for reference only; the query itself returns BDATE and ADDRESS.)

Tables as sets

List all salaries.

    SELECT SALARY FROM EMPLOYEE;

SALARY
30000
40000
25000
43000
38000
25000
25000
55000

Tables as sets

SQL considers a table as a multi-set (bag), i.e. tuples can occur more than once in a table. This differs from the relational model, where a relation is a set.

Why?
Removing duplicates is expensive.
User may want information about duplicates.
Aggregation operators.

Example

List all salaries.

    SELECT SALARY FROM EMPLOYEE;

SALARY
30000
40000
25000
43000
38000
25000
25000
55000

List all salaries without duplicates.

    SELECT DISTINCT SALARY FROM EMPLOYEE;

SALARY
30000
40000
25000
43000
38000
55000

Set operations

Duplicate tuples are removed.
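The difference between bag and set behaviour can be checked on a small scale. The following is a minimal sketch, assuming SQLite through Python's built-in sqlite3 module rather than the DBMS used in the course; the EMPLOYEE table is cut down to two columns, the pairing of last names with salaries follows the order of the result listings on the slides, and the helper `column` is a name introduced only for this illustration.

```python
import sqlite3

# In-memory SQLite database; the table holds only the two columns needed here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (LNAME TEXT, SALARY INTEGER)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?)",
    [
        ("Smith", 30000), ("Wong", 40000), ("Zelaya", 25000), ("Wallace", 43000),
        ("Narayan", 38000), ("English", 25000), ("Jabbar", 25000), ("Borg", 55000),
    ],
)


def column(sql):
    """Hypothetical helper: run a single-column query and return its values."""
    return [row[0] for row in conn.execute(sql)]


# A plain SELECT treats the table as a bag: 25000 is returned three times.
all_salaries = column("SELECT SALARY FROM EMPLOYEE")

# DISTINCT asks for duplicate removal explicitly.
distinct_salaries = column("SELECT DISTINCT SALARY FROM EMPLOYEE ORDER BY SALARY")

# A set operation removes duplicates on its own, without DISTINCT.
union_salaries = column(
    "SELECT SALARY FROM EMPLOYEE UNION SELECT SALARY FROM EMPLOYEE ORDER BY SALARY"
)

print(len(all_salaries), len(distinct_salaries), len(union_salaries))
```

The three counts make the point of the slides concrete: the plain query keeps all eight tuples, while DISTINCT and UNION both reduce them to the six different salary values.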
Queries can be combined by set operations: UNION, INTERSECT, EXCEPT (MySQL before version 8.0.31 supports only UNION).

Retrieve the first names of all people in the database.

    SELECT FNAME FROM EMPLOYEE
    UNION
    SELECT DEPENDENT_NAME FROM DEPENDENT;

Which department managers have dependents? Show their SSN.

    SELECT MGRSSN FROM DEPARTMENT
    INTERSECT
    SELECT ESSN FROM DEPENDENT;

Ambiguous names: Aliasing

What if the same attribute name is used in different relations?

No alias (ambiguous):

    SELECT NAME, NAME
    FROM EMPLOYEE, DEPARTMENT
    WHERE DNO = DNUMBER;

Whole name:

    SELECT EMPLOYEE.NAME, DEPARTMENT.NAME
    FROM EMPLOYEE, DEPARTMENT
    WHERE EMPLOYEE.DNO = DEPARTMENT.DNUMBER;

Alias:

    SELECT E.NAME, D.NAME
    FROM EMPLOYEE E, DEPARTMENT D
    WHERE E.DNO = D.DNUMBER;

Join: Cartesian product

List all employees and the names of their departments.

    SELECT LNAME, DNAME FROM EMPLOYEE, DEPARTMENT;

EMPLOYEE
LNAME    DNO
Smith    5
Wong     5
Zelaya   4
Wallace  4
Narayan  5
English  5
Jabbar   4
Borg     1

DEPARTMENT
DNAME           DNUMBER
Research        5
Administration  4
Headquarters    1

Without a join condition, every employee is paired with every department (8 × 3 = 24 tuples):

LNAME    DNAME
Smith    Research
Wong     Research
Zelaya   Research
Wallace  Research
Narayan  Research
English  Research
Jabbar   Research
Borg     Research
Smith    Administration
Wong     Administration
Zelaya   Administration
Wallace  Administration
Narayan  Administration
English  Administration
Jabbar   Administration
Borg     Administration
Smith    Headquarters
Wong     Headquarters
Zelaya   Headquarters
Wallace  Headquarters
Narayan  Headquarters
English  Headquarters
Jabbar   Headquarters
Borg     Headquarters

Join: Equijoin

List all employees and the names of their departments. (DNO is a foreign key in EMPLOYEE.)
    SELECT LNAME, DNAME FROM EMPLOYEE, DEPARTMENT WHERE DNO = DNUMBER;

DNUMBER is the primary key in DEPARTMENT.
Equijoin: The join condition uses =.
Thetajoin: The join condition uses any of {=, <>, >, >=, <=, !=}.
Both select tuples from the Cartesian product; here only the tuples with DNO = DNUMBER are kept.

LNAME    DNO  DNAME           DNUMBER
Smith    5    Research        5
Wong     5    Research        5
Zelaya   4    Research        5
Wallace  4    Research        5
Narayan  5    Research        5
English  5    Research        5
Jabbar   4    Research        5
Borg     1    Research        5
Smith    5    Administration  4
Wong     5    Administration  4
Zelaya   4    Administration  4
Wallace  4    Administration  4
Narayan  5    Administration  4
English  5    Administration  4
Jabbar   4    Administration  4
Borg     1    Administration  4
Smith    5    Headquarters    1
Wong     5    Headquarters    1
Zelaya   4    Headquarters    1
Wallace  4    Headquarters    1
Narayan  5    Headquarters    1
English  5    Headquarters    1
Jabbar   4    Headquarters    1
Borg     1    Headquarters    1

Join: Self-join

List the last name for all employees together with the last names of their bosses.

    SELECT E.LNAME "Employee", S.LNAME "Boss"
    FROM EMPLOYEE E, EMPLOYEE S
    WHERE E.SUPERSSN = S.SSN;

Employee  Boss
Smith     Wong
Wong      Borg
Zelaya    Wallace
Wallace   Borg
Narayan   Wong
English   Wong
Jabbar    Wallace

Join: Inner join

List the last name for all employees together with the last names of their bosses. The following two queries are equivalent:

    SELECT E.LNAME "Employee", S.LNAME "Boss"
    FROM EMPLOYEE E, EMPLOYEE S
    WHERE E.SUPERSSN = S.SSN;

    SELECT E.LNAME "Employee", S.LNAME "Boss"
    FROM EMPLOYEE E INNER JOIN EMPLOYEE S ON E.SUPERSSN = S.SSN;

Join: Outer join

List the last name for all employees and, if available, show the last names of their bosses.

    SELECT E.LNAME "Employee", S.LNAME "Boss"
    FROM EMPLOYEE E LEFT JOIN EMPLOYEE S ON E.SUPERSSN = S.SSN;

Variants: LEFT JOIN, RIGHT JOIN, FULL JOIN

Employee  Boss
Smith     Wong
Wong      Borg
Zelaya    Wallace
Wallace   Borg
Narayan   Wong
English   Wong
Jabbar    Wallace
Borg      NULL

Joins revisited

A               B
A1    A2        B1    B2
100   A         100   W
NULL  B         200   X
300   C         NULL  Y
NULL  D         NULL  Z

Cartesian product

    SELECT * FROM A, B;

A2  A1    B1    B2
A   100   100   W
B   NULL  100   W
C   300   100   W
D   NULL  100   W
A   100   200   X
B   NULL  200   X
C   300   200   X
D   NULL  200   X
A   100   NULL  Y
B   NULL  NULL  Y
C   300   NULL  Y
D   NULL  NULL  Y
A   100   NULL  Z
B   NULL  NULL  Z
C   300   NULL  Z
D   NULL  NULL  Z

Equijoin, natural join, inner join

    SELECT * FROM A, B WHERE A1 = B1;

A2  A1   B1   B2
A   100  100  W

Thetajoin

    SELECT * FROM A, B WHERE A1 > B1;

A2  A1   B1   B2
C   300  100  W
C   300  200  X

Outer joins revisited

Tables A and B as on the previous slide.

Right outer join

    SELECT * FROM A RIGHT JOIN B ON A1 = B1;

A2    A1    B1    B2
A     100   100   W
NULL  NULL  200   X
NULL  NULL  NULL  Y
NULL  NULL  NULL  Z

Left outer join

    SELECT * FROM A LEFT JOIN B ON A1 = B1;

A2  A1    B1    B2
A   100   100   W
C   300   NULL  NULL
B   NULL  NULL  NULL
D   NULL  NULL  NULL

Full outer join (union of right + left)

    SELECT * FROM A FULL JOIN B ON A1 = B1;

A2    A1    B1    B2
A     100   100   W
NULL  NULL  200   X
NULL  NULL  NULL  Y
NULL  NULL  NULL  Z
C     300   NULL  NULL
B     NULL  NULL  NULL
D     NULL  NULL  NULL

Subqueries

List all employees that do not have any project assignment with more than 10 hours.

A plain join does not answer this question, because it returns every employee with at least one assignment of 10 hours or less:

    SELECT LNAME FROM EMPLOYEE, WORKS_ON WHERE SSN = ESSN AND HOURS <= 10.0;

With a subquery and NOT IN (a subquery can also follow one of {>, >=, <, <=, <>} combined with one of {ANY, SOME, ALL}):

    SELECT LNAME FROM EMPLOYEE
    WHERE SSN NOT IN (SELECT ESSN FROM WORKS_ON WHERE HOURS > 10.0);

Or with EXISTS:

    SELECT LNAME FROM EMPLOYEE
    WHERE NOT EXISTS (SELECT * FROM WORKS_ON WHERE SSN = ESSN AND HOURS > 10.0);

Extended SELECT syntax

    SELECT <attribute-list and function-list>
    FROM <table-list>
    [ WHERE <condition> ]
    [ GROUP BY <grouping attribute-list> ]
    [ HAVING <group condition> ]
    [ ORDER BY <attribute-list> ];

Aggregate functions

Built-in functions: AVG(), SUM(), MIN(), MAX(), COUNT().
They appear only in SELECT and HAVING clauses.
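To see where aggregate functions are and are not accepted, here is a minimal sketch, again assuming SQLite through Python's built-in sqlite3 module (other DBMSs report the error differently). The three-column EMPLOYEE table and the threshold 32000 are chosen only for this illustration; the salaries and department numbers are the ones listed on the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (LNAME TEXT, SALARY INTEGER, DNO INTEGER)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [
        ("Smith", 30000, 5), ("Wong", 40000, 5), ("Zelaya", 25000, 4),
        ("Wallace", 43000, 4), ("Narayan", 38000, 5), ("English", 25000, 5),
        ("Jabbar", 25000, 4), ("Borg", 55000, 1),
    ],
)

# Aggregates in the SELECT clause: one result row for the whole table.
summary = conn.execute(
    "SELECT COUNT(*), MIN(SALARY), MAX(SALARY), AVG(SALARY) FROM EMPLOYEE"
).fetchone()

# Aggregate in the HAVING clause: keep the departments whose average
# salary exceeds 32000 (an arbitrary threshold for this sketch).
well_paid = conn.execute(
    "SELECT DNO, AVG(SALARY) FROM EMPLOYEE "
    "GROUP BY DNO HAVING AVG(SALARY) > 32000 ORDER BY DNO"
).fetchall()

# Aggregate in the WHERE clause: SQLite refuses to prepare the statement.
try:
    conn.execute("SELECT LNAME FROM EMPLOYEE WHERE SALARY > AVG(SALARY)")
    where_rejected = False
except sqlite3.OperationalError as error:
    where_rejected = True
    print("rejected:", error)

print(summary)
print(well_paid)
```

A condition such as "salary above the average" therefore has to be written with a subquery in the WHERE clause, or moved to HAVING when it concerns a whole group.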
NULL values are not considered in the computations. For example, AVG() over the values 50, 100, NULL is 75, whereas AVG() over 50, 100, 0 is 50.

List the total number of employees.

    SELECT COUNT(*) FROM EMPLOYEE;

Grouping

Used to apply an aggregate function to subgroups of tuples in a relation.
GROUP BY: Grouping attributes.
HAVING: Condition that a group has to satisfy.

List, for each department with more than two employees, the department number, the number of employees and the average salary.

    SELECT DNO, COUNT(*), AVG(SALARY)
    FROM EMPLOYEE
    GROUP BY DNO
    HAVING COUNT(*) > 2;

DNO  COUNT(*)  AVG(SALARY)
5    4         33250
4    3         31000
1    1         55000

The group with DNO = 1 does not satisfy HAVING COUNT(*) > 2 and is therefore not part of the result.

Sort query results

Show the department names and their locations in alphabetical order.

    SELECT DNAME, DLOCATION
    FROM DEPARTMENT D, DEPT_LOCATIONS DL
    WHERE D.DNUMBER = DL.DNUMBER
    ORDER BY DNAME ASC, DLOCATION DESC;

DNAME           DLOCATION
Administration  Stafford
Headquarters    Houston
Research        Sugarland
Research        Houston
Research        Bellaire

Null values

List all employees that do not have a boss.

    SELECT FNAME, LNAME FROM EMPLOYEE WHERE SUPERSSN IS NULL;

'SUPERSSN = NULL' and 'SUPERSSN <> NULL' will not return any matching tuples, because NULL is incomparable to any value, including another NULL.

Insert data

    INSERT INTO <table> (<attr>, …) VALUES (<val>, …);
    INSERT INTO <table> (<attr>, …) <subquery>;

Store information about how many hours an employee works for the project '1' into WORKS_ON.

    INSERT INTO WORKS_ON VALUES (123456789, 1, 32.5);

Mind the integrity constraints, in particular referential integrity: ESSN and PNO must refer to an existing employee and project.

Update data

    UPDATE <table> SET <attr> = <val>, … WHERE <condition>;
    UPDATE <table> SET (<attr>, …) = (<subquery>) WHERE <condition>;

Give all employees in the 'Research' department a 10% raise in salary.

    UPDATE EMPLOYEE
    SET SALARY = SALARY * 1.1
    WHERE DNO IN (SELECT DNUMBER FROM DEPARTMENT WHERE DNAME = 'Research');

Mind the integrity constraints, in particular referential integrity.

Delete data

    DELETE FROM <table> WHERE <condition>;

Delete the employees having the last name 'Borg' from the EMPLOYEE table.

    DELETE FROM EMPLOYEE WHERE LNAME = 'Borg';

EMPLOYEE
FNAME   M  LNAME    SSN
Ramesh  K  Narayan  666884444
Joyce   A  English  453453453
Ahmad   V  Jabbar   987987987
James   E  Borg     888665555

DEPARTMENT (MGRSSN is a foreign key)
DNAME           DNUMBER  MGRSSN
Research        5        333445555
Administration  4        987654321
Headquarters    1        888665555

Mind the referential integrity constraint: Borg's SSN is still used as MGRSSN of Headquarters. Which option applies: ON DELETE SET NULL / SET DEFAULT / CASCADE?

Views

A virtual table derived from other (possibly virtual) tables, i.e. always up-to-date.

    CREATE VIEW dept_view AS
    SELECT DNO, COUNT(*), AVG(SALARY)
    FROM EMPLOYEE
    GROUP BY DNO;

Why?
Simplify query commands.
Provide data security.
Enhance programming productivity.

Caveat: Update problems.
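
The last two slides can be tried out together. The following is a minimal sketch, assuming SQLite through Python's built-in sqlite3 module and a cut-down schema: only four employees and the Headquarters department are loaded, the constraint name `fk_dept_mgr`, the view column aliases `NUM_EMP` and `AVG_SALARY` and the helper `view_rows` are introduced for this illustration, and ON DELETE SET NULL is just one of the three options named on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE EMPLOYEE (
    LNAME  TEXT,
    SSN    INTEGER PRIMARY KEY,
    SALARY INTEGER,
    DNO    INTEGER
);
CREATE TABLE DEPARTMENT (
    DNAME   TEXT,
    DNUMBER INTEGER PRIMARY KEY,
    MGRSSN  INTEGER,
    -- Assumed choice for this sketch: a deleted manager leaves MGRSSN as NULL.
    CONSTRAINT fk_dept_mgr FOREIGN KEY (MGRSSN)
        REFERENCES EMPLOYEE(SSN) ON DELETE SET NULL
);
INSERT INTO EMPLOYEE VALUES
    ('Narayan', 666884444, 38000, 5),
    ('English', 453453453, 25000, 5),
    ('Jabbar',  987987987, 25000, 4),
    ('Borg',    888665555, 55000, 1);
INSERT INTO DEPARTMENT VALUES ('Headquarters', 1, 888665555);

-- Same view as on the slide, with column aliases added for readability.
CREATE VIEW dept_view AS
    SELECT DNO, COUNT(*) AS NUM_EMP, AVG(SALARY) AS AVG_SALARY
    FROM EMPLOYEE
    GROUP BY DNO;
""")


def view_rows():
    """Hypothetical helper: current contents of the view, ordered by DNO."""
    return conn.execute("SELECT * FROM dept_view ORDER BY DNO").fetchall()


before = view_rows()

# Deleting Borg triggers the ON DELETE SET NULL action in DEPARTMENT.
conn.execute("DELETE FROM EMPLOYEE WHERE LNAME = 'Borg'")

after = view_rows()
manager = conn.execute(
    "SELECT MGRSSN FROM DEPARTMENT WHERE DNUMBER = 1"
).fetchone()[0]

print(before)
print(after)
print(manager)
```

The view is not stored separately: department 1 drops out of it as soon as its only employee is deleted, and the foreign key action leaves Headquarters without a manager instead of rejecting the DELETE. With ON DELETE CASCADE the Headquarters row itself would be deleted instead.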