Database design process 1 Database technology Lecture 2: Relational databases and SQL

advertisement
Database technology
Lecture 2: Relational databases and SQL
Jose M. Peña
jose.m.pena@liu.se
Database design process
1
Relational model concepts
Attributes
...
EMPLOYEE
Tuples
...
FNAME
M
LNAME
...
SSN
BDATE
ADDRESS
S
SALARY
SUPERSSN
DNO
5
Ramesh
K
Narayan
666884444
1962-09-15
…
M
38000
888665555
Joyce
A
English
453453453
1972-07-31
…
F
25000
888665555
5
Ahmad
V
Jabbar
987987987
1969-03-29
…
M
25000
888665555
4
James
E
Borg
888665555
1937-11-10
…
M
55000
null
1
Relation: Set of tuples, i.e. no duplicates are allowed.
Database: Collection of relations.
EMPLOYEE ( FNAME, M, LNAME, SSN, BDATE, ADDRESS, S, SALARY, SUPERSSN, DNO)
Relation schema
3
Relational model concepts
String shorter than 30 chars
Domain
EMPLOYEE
yyyy-mm-dd
FNAME
M
LNAME
SSN
BDATE
Integer
400 < x < 8000
Character
M or F
ADDRESS
S
SALARY
SUPERSSN
DNO
Ramesh
K
Narayan
666884444
1962-09-15
…
M
38000
888665555
5
Joyce
Null
English
453453453
1972-07-31
…
F
38000
888665555
5
Ahmad
V
Jabbar
987987987
1969-03-29
…
M
25000
888665555
4
James
Null
Borg
888665555
1937-11-10
…
M
55000
Null
1
NULL value
4
2
Relational model constraints
EMPLOYEE
FNAME
M
LNAME
SSN
BDATE
ADDRESS
S
SALARY
SUPERSSN
DNO
Ramesh
K
Narayan
666884444
1962-09-15
…
M
38000
888665555
5
Joyce
Null
English
453453453
1972-07-31
…
F
38000
888665555
5
Ahmad
V
Jabbar
987987987
1969-03-29
…
M
25000
888665555
4
James
Null
Borg
888665555
1937-11-10
…
M
55000
Null
1
Entity integrity constraint
5
Relational model constraints
Foreign keys
EMPLOYEE
FNAME
M
LNAME
SSN
BDATE
ADDRESS
S
SALARY
SUPERSSN
DNO
Ramesh
K
Narayan
666884444
1962-09-15
…
M
38000
888665555
5
Joyce
A
English
453453453
1972-07-31
…
F
25000
888665555
5
Ahmad
V
Jabbar
987987987
1969-03-29
…
M
25000
888665555
4
James
E
Borg
888665555
1937-11-10
…
M
55000
Null
1
Referential integrity constraint
DEPARTMENT
DNAME
DNUMBER
MGRSSN
MGRSTARTDATE
Research
5
666884444
1988-05-22
Administration
4
987987987
1995-01-01
Headquarters
1
888665555
1981-06-19
6
3
Relational model constraints
(Atomic) domain (or NULL).
Key.
Entity integrity: A PK cannot take NULL values.
Referential integrity: A FK in a relation can
only refer to the PK of another relation, and the
domains of the FK and PK must coincide, and
the FK takes NULL value or values that exist
for the PK.
7
SQL
relational data model
relation
SQL
table
attribute
column
tuple
row
Used in many DBMSs.
Declarative (what data to get, not how).
DDL (Data Definition Language)
Queries
CREATE, ALTER, DROP
SELECT
DML (Data Manipulation Language)
INSERT,
DELETE, UPDATE …
8
4
COMPANY schema
EMPLOYEE (FNAME, MINIT, LNAME, SSN, BDATE,
ADDRESS, SEX, SALARY, SUPERSSN, DNO)
DEPT_LOCATIONS (DNUMBER, DLOCATION)
DEPARTMENT (DNAME, DNUMBER, MGRSSN,
MGRSTARTDATE)
WORKS_ON (ESSN, PNO, HOURS)
PROJECT (PNAME, PNUMBER, PLOCATION, DNUM)
DEPENDENT (ESSN, DEPENDENT-NAME, SEX,
BDATE, RELATIONSHIP)
9
Create tables
CREATE TABLE <tablename> (
<colname> <datatype> [<constraint>],
…,
[<constraint>],
…
);
Data
types: Integer, decimal, number, varchar,char, etc.
Constraints: Not null, primary key, foreign key, unique, etc.
10
5
Create tables
CREATE TABLE WORKS_ON (
ESSN integer,
PNO
integer,
HOURS decimal(3,1),
constraint pk_workson
primary key (ESSN, PNO),
constraint fk_works_emp
FOREIGN KEY (ESSN) references EMPLOYEE(SSN),
constraint fk_works_proj
FOREIGN KEY (PNO) references PROJECT(PNUMBER)
);
11
Modify tables
Change the definition of a table: Add, delete and modify
columns and constraints.
ALTER TABLE EMPLOYEE ADD JOB VARCHAR(12);
ALTER TABLE EMPLOYEE DROP COLUMN ADDRESS CASCADE;
ALTER TABLE WORKS_ON DROP FOREIGN KEY fk_works_emp;
ALTER TABLE WORKS_ON ADD CONSTRAINT fk_works_emp
FOREIGN KEY (ESSN) REFERENCES EMPLOYEE(SSN);
Delete a table and its definition
DROP TABLE EMPLOYEE;
12
6
Query tables
SELECT <attribute-list>
FROM <table-list>
WHERE <condition>;
Attribute list: A1, …, Ar
Attributes whose values are required.
Table list: R1, …, Rk
Condition: Boolean expression
Relations to be queried
It identifies the tuples that should be retrieved.
It may include comparison operators(=, <>, >,
>=, etc.) and/or logical operators (and, or, not).
13
Simple query
List the SSN for all employees.
SELECT SSN
FROM EMPLOYEE;
SSN
123456789
333445555
999887777
987654321
666884444
453453453
987987987
888665555
14
7
Use of *
List all information about the employees of
department 5.
SELECT FNAME, MINIT, LNAME,SSN, BDATE,
ADDRESS, SEX, SALARY, SUPERSSN, DNO
FROM EMPLOYEE
WHERE DNO = 5;
or
Comparison operators
SELECT *
FROM EMPLOYEE
WHERE DNO = 5;
{=, <>, >, =>, etc.}
15
Simple query
List the last name, birth date and address for all
employees whose name is `Alicia J. Zelaya‘.
SELECT LNAME, BDATE, ADDRESS
FROM EMPLOYEE
WHERE FNAME = ‘Alicia’
AND MINIT = ‘J’
AND LNAME = ‘Zelaya’;
Logical operators
{and, or, not}
LNAME
BDATE
ADDRESS
Zelaya
1968-07-19
3321 Castle, Spring, TX
16
8
Pattern matching
List the birth date and address for all employees
whose last name contains the substring ‘aya’.
SELECT BDATE, ADDRESS
FROM EMPLOYEE
WHERE LNAME LIKE ‘%aya%’;
LIKE comparison operator
%
replaces 0 or more characters
_
replaces a single character
LNAME
BDATE
Zelaya
1968-07-19
Narayan 1962-09-15
ADDRESS
3321 Castle, Spring, TX
975 Fire Oak, Humble, TX
Tables as sets
List all salaries.
SELECT SALARY
FROM EMPLOYEE;
17
SALARY
30000
40000
25000
43000
38000
25000
25000
55000
18
9
Tables as sets
SQL considers a table as a multi-set (bag),
i.e. tuples can occur more than once in a
table. This is different in a relational model.
Why?
Removing
duplicates is expensive.
User may want information about duplicates.
Aggregation operators.
19
Example
List all salaries.
SELECT SALARY
FROM EMPLOYEE;
SALARY
30000
40000
25000
43000
38000
25000
25000
55000
SALARY
List all salaries without duplicates.
SELECT DISTINCT SALARY
FROM EMPLOYEE;
30000
40000
25000
43000
38000
55000
20
10
Set operations
Duplicate tuples are removed.
Queries can be combined by set operations: UNION,
INTERSECT, EXCEPT (MySQL only supports UNION)
Retrieve the first names of all people in the database.
SELECT FNAME FROM EMPLOYEE
D
E
UNION
SELECT DEPENDENT_NAME FROM DEPENDENT;
Which department managers have dependents? Show
their SSN.
SELECT MGRSSN FROM DEPARTMENT
INTERSECT
SELECT ESSN FROM DEPENDENT;
M
DE
21
Ambiguous names: Aliasing
What if the same attribute name is used in different
relations ?
No alias
SELECT NAME, NAME
FROM EMPLOYEE, DEPARTMENT
WHERE DNO=DNUMBER;
Whole name
SELECT EMPLOYEE.NAME, DEPARTMENT.NAME
FROM EMPLOYEE, DEPARTMENT
WHERE EMPLOYEE.DNO=DEPARTMENT.DNUMBER;
Alias
SELECT E.NAME, D.NAME
FROM EMPLOYEE E, DEPARTMENT D
WHERE E.DNO=D.DNUMBER;
22
11
Join: Cartesian product
LNAME
List all employees and the names of
their departments.
SELECT LNAME, DNAME
FROM EMPLOYEE, DEPARTMENT;
EMPLOYEE
LNAME
DNO
Smith
Wong
Zelaya
Wallace
Narayan
English
Jabbar
Borg
5
5
4
4
5
5
4
1
DEPARTMENT
DNAME
DNUM
Research
Administration
headquarters
5
4
1
DNAME
Smith
Wong
Zelaya
Wallace
Narayan
English
Jabbar
Borg
Smith
Wong
Zelaya
Wallace
Narayan
English
Jabbar
Borg
Smith
Wong
Zelaya
Wallace
Narayan
English
Jabbar
Borg
Research
Research
Research
Research
Research
Research
Research
Research
Administration
Administration
Administration
Administration
Administration
Administration
Administration
Administration
Headquarters
Headquarters
Headquarters
Headquarters
Headquarters
Headquarters
Headquarters
Headquarters
23
Foreign key in
EMPLOYEE
Join: Equijoin
List all employees and the
names of their departments.
SELECT LNAME, DNAME
FROM EMPLOYEE,
DEPARTMENT
WHERE DNO = DNUMBER;
Equijoin
Thetajoin
{=, <>, >, =>, <=, !=}
Cartesian product
Primary key in
DEPARTMENT
LNAME DNO
DNAME
Smith
Wong
Zelaya
Wallace
Narayan
English
Jabbar
Borg
Smith
Wong
Zelaya
Wallace
Narayan
English
Jabbar
Borg
Smith
Wong
Zelaya
Wallace
Narayan
English
Jabbar
Borg
Research
Research
Research
Research
Research
Research
Research
Research
Administration
Administration
Administration
Administration
Administration
Administration
Administration
Administration
Headquarters
Headquarters
Headquarters
Headquarters
Headquarters
Headquarters
Headquarters
Headquarters
5
5
4
4
5
5
4
1
5
5
4
4
5
5
4
1
5
5
4
4
5
5
4
1
DNUMBER
5
5
5
5
5
5
5
5
4
4
4
4
4
4
4
4
1
1
1
1
1
1
1
1
24
12
Join: Self-join
List the last name for all employees together with the last
names of their bosses.
SELECT E.LNAME “Employee”,
S. LNAME “Boss”
FROM EMPLOYEE E, EMPLOYEE S
WHERE E.SUPERSSN = S.SSN;
Employee Boss
Smith
Wong
Zelaya
Wallace
Narayan
English
Jabbar
Wong
Borg
Wallace
Borg
Wong
Wong
Wallace
25
Join: Inner join
List the last name for all employees together with the
last names of their bosses.
SELECT E.LNAME “Employee”, S.LNAME “Boss”
FROM EMPLOYEE E, EMPLOYEE S
WHERE E.SUPERSSN = S.SSN;
SELECT E.LNAME “Employee”, S.LNAME “Boss”
FROM EMPLOYEE E INNER JOIN EMPLOYEE S
ON E.SUPERSSN = S.SSN;
26
13
Join: Outer join
List the last name for all employees
and, if available, show the last names
of their bosses.
Employee Boss
Smith
Wong
Zelaya
Wallace
Narayan
English
Jabbar
Borg
SELECT E.LNAME “Employee”, S. LNAME “Boss”
FROM EMPLOYEE E LEFT JOIN EMPLOYEE S
ON E.SUPERSSN = S.SSN;
LEFT JOIN, RIGHT JOIN, FULL JOIN
Wong
Borg
Wallace
Borg
Wong
Wong
Wallace
NULL
27
Joins revisited
A
Cartesian product
SELECT * FROM a, b;
B
A1
A2
B1
B2
100
A
100
W
null
B
200
X
A2
A1
B1
B2
300
C
null
Y
A
100
100
W
null
D
null
Z
B
null
100
W
C
300
100
W
D
null
100
W
A
100
200
X
B
null
200
X
C
300
200
X
D
null
200
X
A
100
null
Y
Equijoin, natural join, inner join
SELECT * from A, B WHERE A1=B1;
A2
A1
B1
B2
A
100
100
W
B
null
null
Y
C
300
null
Y
D
null
null
Y
A
100
null
Z
B
null
null
Z
A2
A1
B1
B2
C
300
null
Z
C
300
100
W
D
null
null
Z
C
300
200
X
Thetajoin
SELECT * from A, B WHERE A1>B1;
28
14
Outer joins revisited
Right outer join
SELECT * FROM A RIGHT JOIN B on A1=B1;
A2
A1
A
B
A1
A2
B1
B2
100
A
100
W
null
B
200
X
B1
B2
300
C
null
Y
null
D
null
Z
A
100
100
W
null
null
200
X
null
null
null
Y
null
null
null
Z
Full outer join (union of right+left)
SELECT * FROM A FULL JOIN b on A1=B1;
Left outer join
A2
A1
B1
B2
SELECT * FROM A LEFT JOIN B on A1=B1;
A
100
100
W
null
null
200
X
null
null
null
Y
A2
A1
B1
B2
A
100
100
W
C
300
null
null
B
null
null
null
D
null
null
null
null
null
null
Z
C
300
null
null
B
null
null
null
D
null
null
null
29
Subqueries
List all employees that do not have any project assignment with more than 10
hours.
SELECT LNAME FROM EMPLOYEE, WORKS_ON
WHERE SSN = ESSN AND HOURS <= 10.0;
{>, >=, <, <=, <>}
+
{ANY, SOME, ALL}
SELECT LNAME
FROM EMPLOYEE
WHERE SSN NOT IN (SELECT ESSN FROM WORKS_ON
WHERE HOURS > 10.0);
Or
EXISTS
SELECT LNAME
FROM EMPLOYEE
WHERE NOT EXISTS (SELECT * FROM WORKS_ON
WHERE SSN = ESSN AND HOURS > 10.0);
30
15
Extended SELECT syntax
SELECT <attribute-list and function-list>
FROM <table-list>
[ WHERE <condition> ]
[ GROUP BY <grouping attribute-list>]
[ HAVING <group condition> ]
[ ORDER BY <attribute-list> ];
31
Aggregate functions
Built-in functions: AVG(), SUM(), MIN(), MAX(), COUNT()
They appear only in SELECT and HAVING clauses.
NULL values are not considered in the computations.
List the total number of employees.
SELECT COUNT(*)
FROM EMPLOYEE;
50
50
100
100
Null
0
AVG() 75
50
32
16
Grouping
Used to apply an aggregate function to subgroups of
tuples in a relation.
GROUP BY: Grouping attributes.
HAVING: Condition that a group has to satisfy.
List, for each department with more than two employees,
the department number, the number of employees and
the average salary.
SELECT DNO, COUNT(*), AVG(SALARY)
FROM EMPLOYEE
DNO COUNT(*) AVG(SALARY)
GROUP BY DNO
HAVING COUNT(*) > 2;
5
4
33250
4
1
3
1
31000
55000
33
Sort query results
Show the department names and their locations in
alphabetical order.
SELECT DNAME, DLOCATION
FROM DEPARTMENT D, DEPT_LOCATIONS DL
WHERE D.DNUMBER = DL.DNUMBER
ORDER BY DNAME ASC, DLOCATION DESC;
DNAME
Administration
Headquarters
Research
Research
Research
DLOCATION
Stafford
Houston
Sugarland
Houston
Bellaire
34
17
Null values
List all employees that do not have a boss.
SELECT FNAME, LNAME
FROM EMPLOYEE
WHERE SUPERSSN IS NULL;
‘SUPERSSN
= NULL’ and
‘SUPERSSN <> NULL’
will not return any matching tuples,
because NULL is incomparable to any value,
including another NULL.
35
Insert data
INSERT INTO <table> (<attr>,…) VALUES ( <val>, …) ;
INSERT INTO <table> (<attr>, …) <subquery> ;
Store information about how many hours an employee
works for the project ’1' into WORKS_ON.
INSERT INTO WORKS_ON VALUES (123456789, 1, 32.5);
Integrity constraint!
Referential integrity constraint!
36
18
Update data
UPDATE <table> SET <attr> = <val> ,…
WHERE <condition> ;
UPDATE <table> SET (<attr>, ….) = ( <subquery> )
WHERE <condition> ;
Give all employees in the ‘Research’ department a
10% raise in salary.
Integrity constraint!
Referential integrity constraint!
UPDATE EMPLOYEE
SET SALARY = SALARY*1.1
WHERE DNO IN (SELECT DNUMBER
FROM DEPARTMENT
WHERE DNAME = ‘Research’);
37
Delete data
DELETE FROM <table> WHERE <condition> ;
Delete the employees having the last name
‘Borg’ from the EMPLOYEE table.
DELETE FROM EMPLOYEE
WHERE LNAME = ‘Borg’;
EMPLOYEE
FNAME
M
LNAME
SSN
Foreign key
DEPARTMENT
DNAME
DNUMBER
MGRSSN
Ramesh
K
Narayan
666884444
Research
5
333445555
Joyce
A
English
453453453
Administration
4
987654321
Ahmad
V
Jabbar
987987987
Headquarters
1
888665555
James
E
Borg
888665555
ON DELETE SET NUL / DEFAULT / CASCADE ?
Referential integrity constraint!
38
19
Views
A virtual table derived from another (possibly
virtual) tables, i.e. always up-to-date.
CREATE
VIEW dept_view AS
SELECT DNO, COUNT(*), AVG(SALARY)
FROM EMPLOYEE
GROUP BY DNO;
Why?
Simplify
query commands.
Provide data security.
Enhance programming productivity.
Update problems.
39
20
Download