Relational databases and SQL
He Tan, hetan@ida.liu.se

Overview
  [Figure: overview diagram. Labels: Real world, Model, Database, Query, Answer, DBMS (processing of queries and updates; access to stored data), Physical database]

What you will learn
  Relational data model: concepts, constraints
  SQL (Structured Query Language): query, declare, update

Relational data model
  First introduced in 1970 by E.F. Codd at the IBM Research Laboratory in San Jose, California
  RDBMS, e.g. IBM's DB2, Oracle, MySQL, Microsoft Access
  Other models: hierarchical, network, object-relational

Relational model concepts
  Database: a collection of relations
  Relation: a set of tuples
  Attributes: the columns of a relation
  Tuples: the rows of a relation
  Domain: the set of allowed values of an attribute, e.g.
    string shorter than 30 chars
    yyyy-mm-dd
    integer, 400 < x < 8000
    character, M or F

  EMPLOYEE
    FNAME   M  LNAME    SSN        BDATE       ADDRESS  S  SALARY  SUPERSSN   DNO
    Ramesh  K  Narayan  666884444  1962-09-15  …        M  38000   888665555  5
    Joyce   A  English  453453453  1972-07-31  …        F  25000   888665555  5
    Ahmad   V  Jabbar   987987987  1969-03-29  …        M  25000   888665555  4
    James   E  Borg     888665555  1937-11-10  …        M  55000   null       1

  NULL value: marks a missing or unknown value, e.g. the middle initial (M) of Joyce and James below

  EMPLOYEE
    FNAME   M     LNAME    SSN        BDATE       ADDRESS  S  SALARY  SUPERSSN   DNO
    Ramesh  K     Narayan  666884444  1962-09-15  …        M  38000   888665555  5
    Joyce   NULL  English  453453453  1972-07-31  …        F  38000   888665555  5
    Ahmad   V     Jabbar   987987987  1969-03-29  …        M  25000   888665555  4
    James   NULL  Borg     888665555  1937-11-10  …        M  55000   888665555  1

  Relation schema
    EMPLOYEE (FNAME, M, LNAME, SSN, BDATE, ADDRESS, S, SALARY, SUPERSSN, DNO)

Relational model constraints
  Domain constraint: every value of an attribute must belong to the domain of that attribute (see the domain examples above)
  Foreign keys
  EMPLOYEE
    FNAME   M  LNAME    SSN        BDATE       ADDRESS  S  SALARY  SUPERSSN   DNO
    Ramesh  K  Narayan  666884444  1962-09-15  …        M  38000   888665555  5
    Joyce   A  English  453453453  1972-07-31  …        F  25000   888665555  5
    Ahmad   V  Jabbar   987987987  1969-03-29  …        M  25000   888665555  4
    James   E  Borg     888665555  1937-11-10  …        M  55000   null       1

  DEPARTMENT
    DNAME           DNUMBER  MGRSSN     MGRSTARTDATE
    Research        5        666884444  1988-05-22
    Administration  4        987987987  1995-01-01
    Headquarters    1        888665555  1981-06-19

  Key constraints
    A relation is a set of tuples, i.e. no duplicates
    Superkey, (candidate) keys, primary key
  Entity integrity constraint
  Referential integrity constraint
  Constraints on NULL values

Relational model constraints (summary)
  Domain: every value is an (atomic) value from the domain, or NULL.
  Key.
  Entity integrity: PK is NOT NULL.
  NOT NULL.
  Referential integrity: FK of R refers to S if
    domain(FK(R)) = domain(PK(S)), and
    for every tuple r in R: r.FK = s.PK for some tuple s in S, otherwise r.FK is NULL.

SQL
  Structured Query Language
  DDL, DML, …
  Declarative (what, not how)
  Developed by IBM Research as an interface to System R
  (1970s, SEQUEL)
  Used in many database systems

COMPANY schema
  EMPLOYEE (FNAME, MINIT, LNAME, SSN, BDATE, ADDRESS, SEX, SALARY, SUPERSSN, DNO)
  DEPT_LOCATIONS (DNUMBER, DLOCATION)
  DEPARTMENT (DNAME, DNUMBER, MGRSSN, MGRSTARTDATE)
  WORKS_ON (ESSN, PNO, HOURS)
  PROJECT (PNAME, PNUMBER, PLOCATION, DNUM)
  DEPENDENT (ESSN, DEPENDENT_NAME, SEX, BDATE, RELATIONSHIP)

Create tables
  Relational data model   SQL
  relation                table
  attribute               column
  tuple                   row

    CREATE TABLE <tablename> (
      <colname> <datatype> [<constraint>],
      …,
      [<constraint>],
      …
    );

  data types: integer, decimal, number, varchar2, …
  constraints: not null, primary key, foreign key, unique

Create tables (example)
    CREATE TABLE WORKS_ON (
      ESSN integer
        constraint fk_works_emp references EMPLOYEE(SSN),
      PNO integer
        constraint fk_works_proj references PROJECT(PNUMBER),
      HOURS decimal(3,1),
      constraint pk_workson primary key (ESSN, PNO)
    );

Modify tables
  Change the definition of a table: add, delete and modify columns and constraints
    ALTER TABLE EMPLOYEE ADD JOB VARCHAR2(12);
    ALTER TABLE EMPLOYEE DROP COLUMN ADDRESS CASCADE;
    ALTER TABLE DEPTS_INFO DROP CONSTRAINT DInfo_Dept;
  Delete a table and its definition
    DROP TABLE EMPLOYEE;

Query tables
    SELECT <attribute-list>
    FROM <table-list>
    WHERE <condition>;

  attribute-list: R1.A1, …, Rk.Ar
    attributes whose values are to be retrieved
  table-list: R1, …, Rk
    relations to be queried
  condition: conditional (boolean) expression
    identifies the tuples that should be retrieved
    logical operators (and, or, not)
    comparison operators (=, <>, >, >=, …)

Simple query
  List SSN for all employees
    SELECT SSN
    FROM EMPLOYEE;

    SSN
    123456789
    333445555
    999887777
    987654321
    666884444
    453453453
    987987987
    888665555

Simple query. Use of *
  List all information about the employees of department 5
    SELECT FNAME, MINIT, LNAME, SSN, BDATE, ADDRESS, SEX, SALARY, SUPERSSN, DNO
    FROM EMPLOYEE
    WHERE DNO = 5;
  or
    SELECT *
    FROM EMPLOYEE
    WHERE DNO = 5;

    SELECT LNAME, BDATE, ADDRESS
    FROM EMPLOYEE
    WHERE FNAME = 'Alicia' AND MINIT = 'J' AND LNAME = 'Zelaya';

Exact vs pattern matching
  Exact matching: the query above lists last name, birth date and address for all employees whose name is 'Alicia J. Zelaya'

    LNAME   BDATE       ADDRESS
    Zelaya  1968-07-19  3321 Castle, Spring, TX

  Pattern matching: LIKE comparison operator
    % replaces 0 or more characters
    _ replaces a single character
  List birth date and address for all employees whose last name contains the substring 'aya'
    SELECT BDATE, ADDRESS
    FROM EMPLOYEE
    WHERE LNAME LIKE '%aya%';

    BDATE       ADDRESS
    1968-07-19  3321 Castle, Spring, TX     (Zelaya)
    1962-09-15  975 Fire Oak, Humble, TX    (Narayan)

Tables as sets
  SQL considers a table as a multi-set (bag), i.e. tuples can occur more than once in a table
  This is a difference with respect to the relational model
  Why?
    Removing duplicates is expensive
    User may want information about duplicates
    Aggregation operators

  Example: List all salaries
    SELECT SALARY
    FROM EMPLOYEE;

    SALARY
    30000
    40000
    25000
    43000
    38000
    25000
    25000
    55000

  List all salaries without duplicates.
    SELECT DISTINCT SALARY
    FROM EMPLOYEE;

    SALARY
    30000
    40000
    25000
    43000
    38000
    55000

Set operations
  Queries can be combined by set operations: UNION, INTERSECT, EXCEPT
  (MySQL supports INTERSECT and EXCEPT only from version 8.0.31; older versions support only UNION)
  Duplicate tuples are removed.
  Retrieve all first names of all people in our mini world
    SELECT FNAME FROM EMPLOYEE
    UNION
    SELECT DEPENDENT_NAME FROM DEPENDENT;
  Which department managers have dependents? Show their SSN.
    SELECT MGRSSN FROM DEPARTMENT
    INTERSECT
    SELECT ESSN FROM DEPENDENT;

Join. Cartesian product
  List all employees and their department
    SELECT LNAME, DNAME
    FROM EMPLOYEE, DEPARTMENT;

  EMPLOYEE has 8 tuples and DEPARTMENT has 3 (Research, Administration, Headquarters), so the result has 24 rows:

    LNAME    DNAME
    Smith    Research
    Wong     Research
    Zelaya   Research
    Wallace  Research
    Narayan  Research
    English  Research
    Jabbar   Research
    Borg     Research
    Smith    Administration
    Wong     Administration
    Zelaya   Administration
    Wallace  Administration
    Narayan  Administration
    English  Administration
    Jabbar   Administration
    Borg     Administration
    Smith    Headquarters
    Wong     Headquarters
    Zelaya   Headquarters
    Wallace  Headquarters
    Narayan  Headquarters
    English  Headquarters
    Jabbar   Headquarters
    Borg     Headquarters
Join. Equijoin
  List all employees and their department
    SELECT LNAME, DNAME
    FROM EMPLOYEE, DEPARTMENT
    WHERE DNO = DNUMBER;

  DNO is a foreign key in EMPLOYEE; DNUMBER is the primary key in DEPARTMENT
  Equijoin: the join condition uses =
  Thetajoin: the join condition uses an operator from {=, <>, <, <=, >, >=}
  Cartesian product: each tuple in EMPLOYEE is combined with each tuple in DEPARTMENT. The WHERE clause keeps only the rows with DNO = DNUMBER.

    LNAME    DNO  DNAME           DNUMBER
    Smith    5    Research        5
    Wong     5    Research        5
    Zelaya   4    Research        5
    Wallace  4    Research        5
    Narayan  5    Research        5
    English  5    Research        5
    Jabbar   4    Research        5
    Borg     1    Research        5
    Smith    5    Administration  4
    Wong     5    Administration  4
    Zelaya   4    Administration  4
    Wallace  4    Administration  4
    Narayan  5    Administration  4
    English  5    Administration  4
    Jabbar   4    Administration  4
    Borg     1    Administration  4
    Smith    5    Headquarters    1
    Wong     5    Headquarters    1
    Zelaya   4    Headquarters    1
    Wallace  4    Headquarters    1
    Narayan  5    Headquarters    1
    English  5    Headquarters    1
    Jabbar   4    Headquarters    1
    Borg     1    Headquarters    1

Ambiguous names. Aliasing
  The same attribute name may be used in different relations
  No alias
    SELECT LNAME, DNAME
    FROM EMPLOYEE, DEPARTMENT
    WHERE DNO = DNUMBER;
  Whole name
    SELECT EMPLOYEE.LNAME, DEPARTMENT.DNAME
    FROM EMPLOYEE, DEPARTMENT
    WHERE EMPLOYEE.DNO = DEPARTMENT.DNUMBER;
  Alias
    SELECT E.LNAME, D.DNAME
    FROM EMPLOYEE E, DEPARTMENT D
    WHERE E.DNO = D.DNUMBER;

Join. Self-join
  List last name for all employees together with last names of their bosses
    SELECT E.LNAME "Employee", S.LNAME "Boss"
    FROM EMPLOYEE E, EMPLOYEE S
    WHERE E.SUPERSSN = S.SSN;

    Employee  Boss
    Smith     Wong
    Wong      Borg
    Zelaya    Wallace
    Wallace   Borg
    Narayan   Wong
    English   Wong
    Jabbar    Wallace
Join. Inner join
  List last name for all employees together with last names of their bosses
    SELECT E.LNAME "Employee", S.LNAME "Boss"
    FROM EMPLOYEE E, EMPLOYEE S
    WHERE E.SUPERSSN = S.SSN;
  or
    SELECT E.LNAME "Employee", S.LNAME "Boss"
    FROM EMPLOYEE E INNER JOIN EMPLOYEE S
    ON E.SUPERSSN = S.SSN;

  Inner join: a row is returned only if a matching tuple exists in the other relation, i.e. the employee "Borg" is not included in the answer
  To include such tuples, use an "outer join"

Join. Outer join. Introduction
    SELECT E.LNAME, E.SUPERSSN, S.LNAME, S.SSN
    FROM EMPLOYEE E, EMPLOYEE S;

  Cartesian product (first rows):

    E.LNAME  E.SUPERSSN  S.LNAME  S.SSN
    Smith    333445555   Smith    123456789
    Wong     888665555   Smith    123456789
    Zelaya   987654321   Smith    123456789
    Wallace  888665555   Smith    123456789
    Narayan  333445555   Smith    123456789
    English  333445555   Smith    123456789
    Jabbar   987654321   Smith    123456789
    Borg     NULL        Smith    123456789
    Smith    333445555   Wong     333445555
    Wong     888665555   Wong     333445555
    Zelaya   987654321   Wong     333445555
    Wallace  888665555   Wong     333445555
    Narayan  333445555   Wong     333445555
    English  333445555   Wong     333445555
    Jabbar   987654321   Wong     333445555
    Borg     NULL        Wong     333445555
    Smith    333445555   Zelaya   999887777
    Wong     888665555   Zelaya   999887777
    ...

Join. Outer join
  List last name for all employees and, if available, show last names of their bosses
    SELECT E.LNAME "Employee", S.LNAME "Boss"
    FROM EMPLOYEE E LEFT JOIN EMPLOYEE S
    ON E.SUPERSSN = S.SSN;

  The LEFT JOIN returns all the rows from the first table (the left side of the OUTER JOIN operator), even those without a match in the second table; their columns from the second table are NULL.

    Employee  Boss
    Smith     Wong
    Wong      Borg
    Zelaya    Wallace
    Wallace   Borg
    Narayan   Wong
    English   Wong
    Jabbar    Wallace
    Borg      NULL
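
The difference between the inner and the left outer self-join can be tried out in a small script. The following is a minimal sketch and not part of the slides: it assumes SQLite through Python's sqlite3 module instead of the DBMS used in the course, and a cut-down EMPLOYEE table that keeps only LNAME, SSN and SUPERSSN with the values shown on the slides. The ORDER BY clause is added only to make the output order predictable.

```python
import sqlite3

# In-memory database; the cut-down schema is an assumption for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (LNAME TEXT, SSN INTEGER PRIMARY KEY, SUPERSSN INTEGER)"
)
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [
        ("Smith", 123456789, 333445555),
        ("Wong", 333445555, 888665555),
        ("Zelaya", 999887777, 987654321),
        ("Wallace", 987654321, 888665555),
        ("Narayan", 666884444, 333445555),
        ("English", 453453453, 333445555),
        ("Jabbar", 987987987, 987654321),
        ("Borg", 888665555, None),  # Borg has no boss: SUPERSSN is NULL
    ],
)

# Cartesian product of EMPLOYEE with itself: 8 x 8 rows.
cartesian = conn.execute("SELECT COUNT(*) FROM EMPLOYEE E, EMPLOYEE S").fetchone()[0]

# Inner join: only employees whose SUPERSSN matches some SSN.
inner = conn.execute(
    """
    SELECT E.LNAME AS "Employee", S.LNAME AS "Boss"
    FROM EMPLOYEE E INNER JOIN EMPLOYEE S ON E.SUPERSSN = S.SSN
    ORDER BY E.LNAME
    """
).fetchall()

# Left outer join: every employee; the boss is NULL (None) when there is no match.
left = conn.execute(
    """
    SELECT E.LNAME AS "Employee", S.LNAME AS "Boss"
    FROM EMPLOYEE E LEFT JOIN EMPLOYEE S ON E.SUPERSSN = S.SSN
    ORDER BY E.LNAME
    """
).fetchall()

print("Cartesian product rows:", cartesian)
print("Inner join:", inner)
print("Left join: ", left)
```

In this sketch the inner join returns seven rows and leaves out Borg, while the left join returns eight rows and pairs Borg with None, which is how sqlite3 reports NULL.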
Joins revisited
  Tables A and B

    A                 B
    A1    A2          B1    B2
    100   A           100   W
    null  B           200   X
    300   C           null  Y
    null  D           null  Z

  Cartesian product
    SELECT * FROM a, b;

    A1    A2  B1    B2
    100   A   100   W
    null  B   100   W
    300   C   100   W
    null  D   100   W
    100   A   200   X
    null  B   200   X
    300   C   200   X
    null  D   200   X
    100   A   null  Y
    null  B   null  Y
    300   C   null  Y
    null  D   null  Y
    100   A   null  Z
    null  B   null  Z
    300   C   null  Z
    null  D   null  Z

  Equijoin, natural join, inner join
    SELECT * FROM a, b WHERE a1 = b1;

    A1    A2  B1    B2
    100   A   100   W

  Thetajoin
    SELECT * FROM a, b WHERE a1 > b1;

    A1    A2  B1    B2
    300   C   100   W
    300   C   200   X

Outer joins revisited
  Left outer join
    SELECT * FROM a LEFT JOIN b ON a1 = b1;

    A1    A2    B1    B2
    100   A     100   W
    300   C     null  null
    null  B     null  null
    null  D     null  null

  Right outer join
    SELECT * FROM a RIGHT JOIN b ON a1 = b1;

    A1    A2    B1    B2
    100   A     100   W
    null  null  200   X
    null  null  null  Y
    null  null  null  Z

  Full outer join (union of right + left)
    SELECT * FROM a FULL JOIN b ON a1 = b1;

    A1    A2    B1    B2
    100   A     100   W
    null  null  200   X
    null  null  null  Y
    null  null  null  Z
    300   C     null  null
    null  B     null  null
    null  D     null  null

Subqueries
  Which employees have a 10 hour (exact) project assignment?
  The following query returns duplicates (why?):
    SELECT LNAME
    FROM EMPLOYEE, WORKS_ON
    WHERE SSN = ESSN AND HOURS = 10.0;
  With a subquery:
    SELECT LNAME
    FROM EMPLOYEE
    WHERE SSN IN (SELECT ESSN
                  FROM WORKS_ON
                  WHERE HOURS = 10.0);
  Instead of IN: a comparison operator from {>, >=, <, <=, <>} combined with one of {ANY, SOME, ALL}
  With EXISTS (or NOT EXISTS):
    SELECT LNAME
    FROM EMPLOYEE
    WHERE EXISTS (SELECT *
                  FROM WORKS_ON
                  WHERE SSN = ESSN AND HOURS = 10.0);

SQL syntax. More complex
    SELECT <attribute-list and function-list>
    FROM <table-list>
    [ WHERE <condition> ]
    [ GROUP BY <grouping attribute-list> ]
    [ HAVING <group condition> ]
    [ ORDER BY <attribute-list> ];

Aggregate functions
  Built-in functions: AVG(), SUM(), MIN(), MAX(), COUNT()
  They appear in SELECT and HAVING clauses!
  List the number of employees
    SELECT COUNT(*)
    FROM EMPLOYEE;
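
The subquery and aggregate slides can be checked with a short script. This is a hedged sketch rather than course material: it assumes SQLite through Python's sqlite3 module, an EMPLOYEE table reduced to LNAME, SSN, SALARY and DNO with the values used on the slides, and a few WORKS_ON rows that are hypothetical and chosen so that one employee has two 10-hour assignments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE EMPLOYEE (
        LNAME  TEXT,
        SSN    INTEGER PRIMARY KEY,
        SALARY INTEGER,
        DNO    INTEGER
    );
    CREATE TABLE WORKS_ON (
        ESSN  INTEGER REFERENCES EMPLOYEE(SSN),
        PNO   INTEGER,
        HOURS REAL,
        PRIMARY KEY (ESSN, PNO)
    );
    INSERT INTO EMPLOYEE VALUES
        ('Smith',   123456789, 30000, 5),
        ('Wong',    333445555, 40000, 5),
        ('Zelaya',  999887777, 25000, 4),
        ('Wallace', 987654321, 43000, 4),
        ('Narayan', 666884444, 38000, 5),
        ('English', 453453453, 25000, 5),
        ('Jabbar',  987987987, 25000, 4),
        ('Borg',    888665555, 55000, 1);
    -- Hypothetical assignments: Wong has two 10-hour projects, Zelaya has one.
    INSERT INTO WORKS_ON VALUES
        (123456789, 1, 32.5),
        (333445555, 2, 10.0),
        (333445555, 3, 10.0),
        (999887777, 10, 10.0);
    """
)


def names(sql):
    """Run a one-column query and return its values in sorted order."""
    return sorted(row[0] for row in conn.execute(sql))


# Join: one result row per matching WORKS_ON row, so Wong appears twice.
join_names = names(
    "SELECT LNAME FROM EMPLOYEE, WORKS_ON WHERE SSN = ESSN AND HOURS = 10.0"
)
# IN and EXISTS: each employee is tested once, so there are no duplicates.
in_names = names(
    "SELECT LNAME FROM EMPLOYEE "
    "WHERE SSN IN (SELECT ESSN FROM WORKS_ON WHERE HOURS = 10.0)"
)
exists_names = names(
    "SELECT LNAME FROM EMPLOYEE "
    "WHERE EXISTS (SELECT * FROM WORKS_ON WHERE SSN = ESSN AND HOURS = 10.0)"
)

# Aggregates: over the whole table, and per department with GROUP BY / HAVING.
employee_count = conn.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0]
groups = conn.execute(
    "SELECT DNO, COUNT(*), AVG(SALARY) FROM EMPLOYEE "
    "GROUP BY DNO HAVING COUNT(*) > 2 ORDER BY DNO DESC"
).fetchall()

print(join_names, in_names, exists_names)
print(employee_count, groups)
```

With these assumed rows, the join lists Wong twice because the join produces one row per matching assignment, whereas IN and EXISTS list each employee once.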
  NULL values are eliminated before an aggregate is computed:

    Values    50    50
              100   100
              NULL  0
    AVG()     75    50

Order of query results
  Select department names and their locations in alphabetical order.
    SELECT DNAME, DLOCATION
    FROM DEPARTMENT D, DEPT_LOCATIONS DL
    WHERE D.DNUMBER = DL.DNUMBER
    ORDER BY DNAME ASC, DLOCATION DESC;

    DNAME           DLOCATION
    Administration  Stafford
    Headquarters    Houston
    Research        Sugarland
    Research        Houston
    Research        Bellaire

Grouping
  Used to apply an aggregate function to subgroups of tuples in a relation
  GROUP BY: grouping attributes
  HAVING: condition that a group has to satisfy
  Tuples with a NULL value of the grouping attribute form a separate group
  List for each department the department number, the number of employees and the average salary.
    SELECT DNO, COUNT(*), AVG(SALARY)
    FROM EMPLOYEE
    GROUP BY DNO
    HAVING COUNT(*) > 2;

    DNO  COUNT(*)  AVG(SALARY)
    5    4         33250
    4    3         31000
    1    1         55000

  The table shows all groups; with HAVING COUNT(*) > 2 the group with DNO = 1 is dropped.

Null values
  List all employees that do not have a boss.
    SELECT FNAME, LNAME
    FROM EMPLOYEE
    WHERE SUPERSSN IS NULL;
  'SUPERSSN = NULL' and 'SUPERSSN <> NULL' will not return any matching tuples

Insert new data
    INSERT INTO <table> (<attr>, …) VALUES (<val>, …);
    INSERT INTO <table> (<attr>, …) <subquery>;
  Store information about how many hours an employee works for the project '1' into WORKS_ON
    INSERT INTO WORKS_ON VALUES (123456789, 1, 32.5);
  Integrity constraint! Referential integrity constraint!

Managing data. Delete stored data
    DELETE FROM <table> WHERE <condition>;
  Delete employees having the last name 'Borg' from the EMPLOYEE table
    DELETE FROM EMPLOYEE
    WHERE LNAME = 'Borg';
  Integrity constraint! Referential integrity constraint!

Managing data. Modify stored data
    UPDATE <table> SET <attr> = <val>, … WHERE <condition>;
    UPDATE <table> SET (<attr>, …) = (<subquery>) WHERE <condition>;
  Integrity constraint! Referential integrity constraint!
  Give all employees in the 'Research' department a 10% raise in salary.
    UPDATE EMPLOYEE
    SET SALARY = SALARY * 1.1
    WHERE DNO IN (SELECT DNUMBER
                  FROM DEPARTMENT
                  WHERE DNAME = 'Research');

  Foreign key: MGRSSN in DEPARTMENT refers to SSN in EMPLOYEE

  EMPLOYEE
    FNAME   M  LNAME    SSN
    Ramesh  K  Narayan  666884444
    Joyce   A  English  453453453
    Ahmad   V  Jabbar   987987987
    James   E  Borg     888665555

  DEPARTMENT
    DNAME           DNUMBER  MGRSSN
    Research        5        333445555
    Administration  4        987654321
    Headquarters    1        888665555

  What should happen to the referencing tuple when a referenced tuple is deleted or modified?
    SET NULL ?
    SET DEFAULT ?
    CASCADE ?

Views
  A virtual table derived from other, possibly virtual, tables.
    CREATE VIEW dept_view AS
      SELECT DNUMBER, DNAME
      FROM DEPARTMENT;
  Why?
    Simplify query commands
    Provide data security
    Enhance programming productivity
  Update problems
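
The referential actions and the view from the last slides can be explored together in one script. This is only a sketch under stated assumptions: it uses SQLite through Python's sqlite3 module rather than the course DBMS, it reduces EMPLOYEE to SSN and LNAME, it declares ON DELETE SET NULL as one of the three options that the slide leaves open, and the department 'Sales' with manager 111111111 is a made-up test row. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON, and that its views are read-only unless triggers are defined; other systems may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only on request

conn.executescript(
    """
    CREATE TABLE EMPLOYEE (
        SSN   INTEGER PRIMARY KEY,
        LNAME TEXT NOT NULL
    );
    CREATE TABLE DEPARTMENT (
        DNAME   TEXT NOT NULL,
        DNUMBER INTEGER PRIMARY KEY,
        -- Assumed referential action; SET DEFAULT or CASCADE are the alternatives.
        MGRSSN  INTEGER REFERENCES EMPLOYEE(SSN) ON DELETE SET NULL
    );
    INSERT INTO EMPLOYEE VALUES
        (333445555, 'Wong'), (987654321, 'Wallace'), (888665555, 'Borg');
    INSERT INTO DEPARTMENT VALUES
        ('Research', 5, 333445555),
        ('Administration', 4, 987654321),
        ('Headquarters', 1, 888665555);
    CREATE VIEW dept_view AS SELECT DNUMBER, DNAME FROM DEPARTMENT;
    """
)

# Referential integrity on insert: a manager that does not exist is rejected.
try:
    conn.execute("INSERT INTO DEPARTMENT VALUES ('Sales', 6, 111111111)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

# Deleting a referenced employee triggers the declared action (SET NULL).
conn.execute("DELETE FROM EMPLOYEE WHERE LNAME = 'Borg'")
hq_manager = conn.execute(
    "SELECT MGRSSN FROM DEPARTMENT WHERE DNAME = 'Headquarters'"
).fetchone()[0]

# The view exposes only DNUMBER and DNAME, not MGRSSN.
view_rows = conn.execute("SELECT * FROM dept_view ORDER BY DNUMBER").fetchall()

# Update problem: in SQLite a plain view cannot be modified.
try:
    conn.execute("INSERT INTO dept_view VALUES (7, 'Support')")
    view_is_updatable = True
except sqlite3.Error:
    view_is_updatable = False

print(fk_rejected, hq_manager, view_rows, view_is_updatable)
```

Replacing SET NULL with CASCADE in this sketch would delete the Headquarters row together with Borg, which is why the choice of action has to be made per foreign key.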