Relational databases and SQL Overview He Tan

He Tan
hetan@ida.liu.se

Overview
[Figure: database system overview. Labels: Real world, Model, Database, Query, Answer, DBMS (Processing of queries and updates; Access to stored data), Physical database]

What you will learn
• Relational data model
  - concepts, constraints
• SQL (Structured Query Language)
  - query, declare, update

Relational data model
• First introduced in 1970 by E.F. Codd at the IBM Research Laboratory in San Jose, California
• RDBMS, e.g. IBM's DB2, Oracle, MySQL, Microsoft Access
• Other models: hierarchical, network, object-relational

Relational model concepts
• Database: a collection of relations
• Relation: a set of tuples
• Relation schema: EMPLOYEE (FNAME, M, LNAME, SSN, BDATE, ADDRESS, S, SALARY, SUPERSSN, DNO)
• Attributes are the columns of a relation, tuples are its rows
• Domain: the values an attribute may take, e.g.
  - string shorter than 30 chars
  - date, yyyy-mm-dd
  - integer, 400 < x < 8000
  - character, M or F
• NULL value

EMPLOYEE
FNAME   M  LNAME    SSN        BDATE       ADDRESS  S  SALARY  SUPERSSN   DNO
Ramesh  K  Narayan  666884444  1962-09-15  …        M  38000   888665555  5
Joyce   A  English  453453453  1972-07-31  …        F  25000   888665555  5
Ahmad   V  Jabbar   987987987  1969-03-29  …        M  25000   888665555  4
James   E  Borg     888665555  1937-11-10  …        M  55000   null       1

EMPLOYEE (with NULL values)
FNAME   M     LNAME    SSN        BDATE       ADDRESS  S  SALARY  SUPERSSN   DNO
Ramesh  K     Narayan  666884444  1962-09-15  …        M  38000   888665555  5
Joyce   Null  English  453453453  1972-07-31  …        F  38000   888665555  5
Ahmad   V     Jabbar   987987987  1969-03-29  …        M  25000   888665555  4
James   Null  Borg     888665555  1937-11-10  …        M  55000   888665555  1

Relational model constraints
• Domain constraint, e.g. string shorter than 30 chars; yyyy-mm-dd; integer, 400 < x < 8000; character, M or F
• Constraints on NULL values
• Key constraints
  - Relation: a set of tuples, i.e. no duplicates
  - Superkey
  - (Candidate) keys
  - Primary key
• Entity integrity constraint
• Foreign keys: referential integrity constraint

EMPLOYEE
FNAME   M  LNAME    SSN        BDATE       ADDRESS  S  SALARY  SUPERSSN   DNO
Ramesh  K  Narayan  666884444  1962-09-15  …        M  38000   888665555  5
Joyce   A  English  453453453  1972-07-31  …        F  25000   888665555  5
Ahmad   V  Jabbar   987987987  1969-03-29  …        M  25000   888665555  4
James   E  Borg     888665555  1937-11-10  …        M  55000   null       1

DEPARTMENT
DNAME           DNUMBER  MGRSSN     MGRSTARTDATE
Research        5        666884444  1988-05-22
Administration  4        987987987  1995-01-01
Headquarters    1        888665555  1981-06-19

Relational model constraints
• (Atomic) domain (or NULL).
• Key.
• Entity integrity: PK is NOT NULL.
• NOT NULL
• Referential integrity: FK of R referring to S if
  - domain(FK(R)) = domain(PK(S))
  - r.FK = s.PK for some s, otherwise NULL.

SQL
• Structured Query Language
  - DDL, DML, …
  - Declarative (what, not how)
• Developed by IBM Research as interface to System R. (1970s, SEQUEL)
• Used in many database systems

COMPANY schema
• EMPLOYEE (FNAME, MINIT, LNAME, SSN, BDATE, ADDRESS, SEX, SALARY, SUPERSSN, DNO)
• DEPT_LOCATIONS (DNUMBER, DLOCATION)
• DEPARTMENT (DNAME, DNUMBER, MGRSSN, MGRSTARTDATE)
• WORKS_ON (ESSN, PNO, HOURS)
• PROJECT (PNAME, PNUMBER, PLOCATION, DNUM)
• DEPENDENT (ESSN, DEPENDENT_NAME, SEX, BDATE, RELATIONSHIP)

Relational data model   SQL
relation                table
attribute               column
tuple                   row

Create tables
CREATE TABLE <tablename> (
    <colname> <datatype> [<constraint>],
    …,
    [<constraint>],
    …
);
• data types: integer, decimal, number, varchar2, …
• constraints: not null, primary key, foreign key, unique

Create tables
CREATE TABLE WORKS_ON (
    ESSN integer
        constraint fk_works_emp
        references EMPLOYEE(SSN),
    PNO integer
        constraint fk_works_proj
        references PROJECT(PNUMBER),
    HOURS decimal(3,1),
    constraint pk_workson
        primary key (ESSN, PNO)
);

Modify tables
• Change the definition of a table: add, delete and modify columns and constraints
ALTER TABLE EMPLOYEE ADD JOB VARCHAR2(12);
ALTER TABLE EMPLOYEE DROP COLUMN ADDRESS CASCADE;
ALTER TABLE DEPTS_INFO
    DROP CONSTRAINT DInfo_Dept;
• Delete a table and its definition
DROP TABLE EMPLOYEE;

Query tables
SELECT <attribute-list>
FROM <table-list>
WHERE <condition>;
• attribute-list: R1.A1, …, Rk.Ar
  - the attributes whose values are required
• table-list: R1, …, Rk
  - the relations to be queried
• condition: conditional (boolean) expression
  - identifies the tuples that should be retrieved
  - logical operators (and, or, not)
  - comparison operators (=, <>, >, >=, …)

Simple query
• List SSN for all employees
SELECT SSN
FROM EMPLOYEE;

SSN
123456789
333445555
999887777
987654321
666884444
453453453
987987987
888665555
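
The table definition and the simple query above can be tried together. The following is a minimal sketch, assuming a cut-down COMPANY schema: the three tables carry only a few of their columns, and the generic VARCHAR type is used instead of varchar2 so that the script is not tied to one system. Both choices are assumptions made for illustration.

```sql
-- Cut-down, hypothetical versions of the COMPANY tables (only a few columns).
CREATE TABLE EMPLOYEE (
    SSN   INTEGER PRIMARY KEY,        -- key constraint + entity integrity
    LNAME VARCHAR(30) NOT NULL,       -- constraint on NULL values
    DNO   INTEGER
);
CREATE TABLE PROJECT (
    PNUMBER INTEGER PRIMARY KEY,
    PNAME   VARCHAR(30) UNIQUE
);
CREATE TABLE WORKS_ON (
    ESSN  INTEGER REFERENCES EMPLOYEE(SSN),     -- foreign key
    PNO   INTEGER REFERENCES PROJECT(PNUMBER),  -- foreign key
    HOURS DECIMAL(3,1),
    PRIMARY KEY (ESSN, PNO)
);

INSERT INTO EMPLOYEE VALUES (123456789, 'Smith', 5);
INSERT INTO EMPLOYEE VALUES (333445555, 'Wong', 5);
INSERT INTO EMPLOYEE VALUES (888665555, 'Borg', 1);

-- List SSN for all employees: three rows.
SELECT SSN
FROM EMPLOYEE;

-- Condition with a logical and a comparison operator: returns only Smith.
SELECT SSN, LNAME
FROM EMPLOYEE
WHERE DNO = 5 AND LNAME <> 'Wong';
```

The referenced tables are created before WORKS_ON because a foreign key can only refer to a table that already exists.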

Simple query. Use of *
• List all information about the employees of department 5
SELECT FNAME, MINIT, LNAME, SSN, BDATE,
       ADDRESS, SEX, SALARY, SUPERSSN, DNO
FROM EMPLOYEE
WHERE DNO = 5;
or
SELECT *
FROM EMPLOYEE
WHERE DNO = 5;

Exact vs pattern matching
• List last name, birth date and address for all employees whose name is 'Alicia J. Zelaya'
SELECT LNAME, BDATE, ADDRESS
FROM EMPLOYEE
WHERE FNAME = 'Alicia'
  AND MINIT = 'J'
  AND LNAME = 'Zelaya';

LNAME    BDATE       ADDRESS
Zelaya   1968-07-19  3321 Castle, Spring, TX

• List last name, birth date and address for all employees whose last name contains the substring 'aya'
SELECT LNAME, BDATE, ADDRESS
FROM EMPLOYEE
WHERE LNAME LIKE '%aya%';

LNAME    BDATE       ADDRESS
Zelaya   1968-07-19  3321 Castle, Spring, TX
Narayan  1962-09-15  975 Fire Oak, Humble, TX

LIKE comparison operator
  %  replaces 0 or more characters
  _  replaces a single character

Tables as sets
• SQL considers a table as a multi-set (bag), i.e. tuples can occur more than once in a table
  - Difference wrt the relational model
• Why?
  - Removing duplicates is expensive
  - User may want information about duplicates
  - Aggregation operators

Example
• List all salaries
SELECT SALARY
FROM EMPLOYEE;

SALARY
30000
40000
25000
43000
38000
25000
25000
55000

Example
• List all salaries without duplicates.
SELECT DISTINCT SALARY
FROM EMPLOYEE;

SALARY
30000
40000
25000
43000
38000
55000

Set operations
• Queries can be combined by set operations: UNION, INTERSECT, EXCEPT (MySQL before 8.0.31 only supports UNION)
• Duplicate tuples are removed.
• Retrieve all first names of all people in our mini world
SELECT FNAME FROM EMPLOYEE
UNION
SELECT DEPENDENT_NAME FROM DEPENDENT;
• Which department managers have dependents? Show their SSN.
SELECT MGRSSN FROM DEPARTMENT
INTERSECT
SELECT ESSN FROM DEPENDENT;
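
The difference between bag and set behaviour can be checked with a small script. This is a sketch only: the table below keeps just the SALARY column with the eight salaries listed above, and the ID column is a hypothetical stand-in for the real key.

```sql
-- Hypothetical mini-table: ID is invented, the salaries are the ones shown above.
CREATE TABLE EMPLOYEE (ID INTEGER PRIMARY KEY, SALARY INTEGER);
INSERT INTO EMPLOYEE VALUES (1, 30000);
INSERT INTO EMPLOYEE VALUES (2, 40000);
INSERT INTO EMPLOYEE VALUES (3, 25000);
INSERT INTO EMPLOYEE VALUES (4, 43000);
INSERT INTO EMPLOYEE VALUES (5, 38000);
INSERT INTO EMPLOYEE VALUES (6, 25000);
INSERT INTO EMPLOYEE VALUES (7, 25000);
INSERT INTO EMPLOYEE VALUES (8, 55000);

-- Bag semantics: 8 rows, 25000 appears three times.
SELECT SALARY FROM EMPLOYEE;

-- Set semantics on request: 6 rows.
SELECT DISTINCT SALARY FROM EMPLOYEE;

-- UNION removes duplicates: 6 rows.
SELECT SALARY FROM EMPLOYEE
UNION
SELECT SALARY FROM EMPLOYEE;

-- UNION ALL keeps duplicates: 16 rows.
SELECT SALARY FROM EMPLOYEE
UNION ALL
SELECT SALARY FROM EMPLOYEE;
```

UNION ALL is not mentioned on the slides. It is included here only as the standard way to keep duplicates when combining queries.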

Join. Cartesian product
• List all employees and their department
SELECT LNAME, DNAME
FROM EMPLOYEE, DEPARTMENT;

EMPLOYEE.LNAME: Smith, Wong, Zelaya, Wallace, Narayan, English, Jabbar, Borg
DEPARTMENT.DNAME: Research, Administration, Headquarters

Result: each tuple in EMPLOYEE is combined with each tuple in DEPARTMENT (Cartesian product)

LNAME    DNAME
Smith    Research
Wong     Research
Zelaya   Research
Wallace  Research
Narayan  Research
English  Research
Jabbar   Research
Borg     Research
Smith    Administration
Wong     Administration
Zelaya   Administration
Wallace  Administration
Narayan  Administration
English  Administration
Jabbar   Administration
Borg     Administration
Smith    Headquarters
Wong     Headquarters
Zelaya   Headquarters
Wallace  Headquarters
Narayan  Headquarters
English  Headquarters
Jabbar   Headquarters
Borg     Headquarters

Join. Equijoin
• List all employees and their department
SELECT LNAME, DNAME
FROM EMPLOYEE, DEPARTMENT
WHERE DNO = DNUMBER;
• DNO: foreign key in EMPLOYEE
• DNUMBER: primary key in DEPARTMENT
• Equijoin: comparison with =
• Thetajoin: comparison with one of {=, <>, >, >=, <=, !=}

Cartesian product of EMPLOYEE and DEPARTMENT. The equijoin keeps only the tuples where DNO = DNUMBER.

LNAME    DNO  DNAME           DNUMBER
Smith    5    Research        5
Wong     5    Research        5
Zelaya   4    Research        5
Wallace  4    Research        5
Narayan  5    Research        5
English  5    Research        5
Jabbar   4    Research        5
Borg     1    Research        5
Smith    5    Administration  4
Wong     5    Administration  4
Zelaya   4    Administration  4
Wallace  4    Administration  4
Narayan  5    Administration  4
English  5    Administration  4
Jabbar   4    Administration  4
Borg     1    Administration  4
Smith    5    Headquarters    1
Wong     5    Headquarters    1
Zelaya   4    Headquarters    1
Wallace  4    Headquarters    1
Narayan  5    Headquarters    1
English  5    Headquarters    1
Jabbar   4    Headquarters    1
Borg     1    Headquarters    1

Ambiguous names. Aliasing
• Same attribute name used in different relations
  - No alias
    SELECT LNAME, DNAME
    FROM EMPLOYEE, DEPARTMENT
    WHERE DNO=DNUMBER;
  - Whole name
    SELECT EMPLOYEE.LNAME,
           DEPARTMENT.DNAME
    FROM EMPLOYEE, DEPARTMENT
    WHERE EMPLOYEE.DNO=
          DEPARTMENT.DNUMBER;
  - Alias
    SELECT E.LNAME, D.DNAME
    FROM EMPLOYEE E, DEPARTMENT D
    WHERE E.DNO=D.DNUMBER;

Join. Self-join
• List last name for all employees together with last names of their bosses
SELECT E.LNAME "Employee",
       S.LNAME "Boss"
FROM EMPLOYEE E, EMPLOYEE S
WHERE E.SUPERSSN = S.SSN;

Employee  Boss
Smith     Wong
Wong      Borg
Zelaya    Wallace
Wallace   Borg
Narayan   Wong
English   Wong
Jabbar    Wallace

Join. Inner join
• List last name for all employees together with last names of their bosses
SELECT E.LNAME "Employee",
       S.LNAME "Boss"
FROM EMPLOYEE E, EMPLOYEE S
WHERE E.SUPERSSN = S.SSN;
• The same query with INNER JOIN:
SELECT E.LNAME "Employee",
       S.LNAME "Boss"
FROM EMPLOYEE E INNER JOIN EMPLOYEE S
     ON E.SUPERSSN = S.SSN;

Join. Outer join. Introduction
• List last name for all employees together with last names of their bosses
SELECT E.LNAME "Employee",
       S.LNAME "Boss"
FROM EMPLOYEE E
     INNER JOIN EMPLOYEE S
     ON E.SUPERSSN = S.SSN;
• Cartesian product
SELECT E.LNAME, E.SUPERSSN,
       S.LNAME, S.SSN
FROM EMPLOYEE E, EMPLOYEE S;

E.LNAME  E.SUPERSSN  S.LNAME  S.SSN
Smith    333445555   Smith    123456789
Wong     888665555   Smith    123456789
Zelaya   987654321   Smith    123456789
Wallace  888665555   Smith    123456789
Narayan  333445555   Smith    123456789
English  333445555   Smith    123456789
Jabbar   987654321   Smith    123456789
Borg     NULL        Smith    123456789
Smith    333445555   Wong     333445555
Wong     888665555   Wong     333445555
Zelaya   987654321   Wong     333445555
Wallace  888665555   Wong     333445555
Narayan  333445555   Wong     333445555
English  333445555   Wong     333445555
Jabbar   987654321   Wong     333445555
Borg     NULL        Wong     333445555
Smith    333445555   Zelaya   999887777
Wong     888665555   Zelaya   999887777
...

• Inner join: a tuple is included only if a matching tuple exists in the other relation, i.e. the employee "Borg" is not included in the answer
• Use "outer join"

Join. Outer join
• List last name for all employees and, if available, show last names of their bosses
SELECT E.LNAME "Employee",
       S.LNAME "Boss"
FROM EMPLOYEE E LEFT JOIN EMPLOYEE S
     ON E.SUPERSSN = S.SSN;
The LEFT JOIN returns all the rows from the first table (the left side of the OUTER JOIN operator).

Employee  Boss
Smith     Wong
Wong      Borg
Zelaya    Wallace
Wallace   Borg
Narayan   Wong
English   Wong
Jabbar    Wallace
Borg      NULL

Joins revisited

A               B
A1    A2        B1    B2
100   A         100   W
null  B         200   X
300   C         null  Y
null  D         null  Z

Cartesian product
SELECT * FROM a, b;

A2  A1    B1    B2
A   100   100   W
B   null  100   W
C   300   100   W
D   null  100   W
A   100   200   X
B   null  200   X
C   300   200   X
D   null  200   X
A   100   null  Y
B   null  null  Y
C   300   null  Y
D   null  null  Y
A   100   null  Z
B   null  null  Z
C   300   null  Z
D   null  null  Z

Equijoin, natural join, inner join
SELECT * FROM a, b WHERE a1=b1;

A2  A1    B1    B2
A   100   100   W

Thetajoin
SELECT * FROM a, b WHERE a1>b1;

A2  A1    B1    B2
C   300   100   W
C   300   200   X
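
The join variants on the tables a and b, and the LEFT JOIN used for the employees above, can be reproduced with the script below. It is a sketch under assumptions: the column types are guesses (the slides show only the values), and the row counts in the comments follow from the data shown on the slides.

```sql
-- Tables a and b with the values from the slides; column types are assumed.
CREATE TABLE a (a1 INTEGER, a2 CHAR(1));
CREATE TABLE b (b1 INTEGER, b2 CHAR(1));

INSERT INTO a VALUES (100,  'A');
INSERT INTO a VALUES (NULL, 'B');
INSERT INTO a VALUES (300,  'C');
INSERT INTO a VALUES (NULL, 'D');

INSERT INTO b VALUES (100,  'W');
INSERT INTO b VALUES (200,  'X');
INSERT INTO b VALUES (NULL, 'Y');
INSERT INTO b VALUES (NULL, 'Z');

-- Cartesian product: 4 x 4 = 16 rows.
SELECT a2, a1, b1, b2 FROM a, b;

-- Equijoin: 1 row (A, 100, 100, W). NULL = NULL is not true, so rows with NULL never match.
SELECT a2, a1, b1, b2 FROM a, b WHERE a1 = b1;

-- Thetajoin: 2 rows, both for C (300 > 100 and 300 > 200).
SELECT a2, a1, b1, b2 FROM a, b WHERE a1 > b1;

-- Left outer join: 4 rows, one per row of a; B, C and D are padded with NULLs.
SELECT a2, a1, b1, b2 FROM a LEFT JOIN b ON a1 = b1;
```

The select list names the columns explicitly so that the output has the same column order (A2, A1, B1, B2) as the result tables on the slides.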

Outer joins revisited

A               B
A1    A2        B1    B2
100   A         100   W
null  B         200   X
300   C         null  Y
null  D         null  Z

Left outer join
SELECT * FROM a LEFT JOIN b ON a1=b1;

A2    A1    B1    B2
A     100   100   W
C     300   null  null
B     null  null  null
D     null  null  null

Right outer join
SELECT * FROM a RIGHT JOIN b ON a1=b1;

A2    A1    B1    B2
A     100   100   W
null  null  200   X
null  null  null  Y
null  null  null  Z

Full outer join (union of right+left)
SELECT * FROM a FULL JOIN b ON a1=b1;

A2    A1    B1    B2
A     100   100   W
null  null  200   X
null  null  null  Y
null  null  null  Z
C     300   null  null
B     null  null  null
D     null  null  null

Subqueries
• Which employees have a 10 hour (exact) project assignment?
  - Following query returns duplicates (why?):
    SELECT LNAME FROM EMPLOYEE, WORKS_ON
    WHERE SSN = ESSN AND HOURS = 10.0;
  - With a subquery:
    SELECT LNAME
    FROM EMPLOYEE
    WHERE SSN IN (SELECT ESSN FROM WORKS_ON
                  WHERE HOURS = 10.0);
    Or
    SELECT LNAME
    FROM EMPLOYEE
    WHERE EXISTS (SELECT * FROM WORKS_ON
                  WHERE SSN = ESSN AND HOURS = 10.0);
• Other subquery conditions: a comparison operator {>, >=, <, <=, <>} combined with {ANY, SOME, ALL}; NOT EXISTS
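
Why the join returns duplicates while the subquery forms do not can be seen on a small data set. This is a sketch with hypothetical WORKS_ON rows: the slides do not list the contents of WORKS_ON, so the two 10-hour assignments for Wong are invented for illustration, and the tables carry only the columns the queries need.

```sql
CREATE TABLE EMPLOYEE (SSN INTEGER PRIMARY KEY, LNAME VARCHAR(30));
CREATE TABLE WORKS_ON (
    ESSN  INTEGER REFERENCES EMPLOYEE(SSN),
    PNO   INTEGER,
    HOURS DECIMAL(3,1),
    PRIMARY KEY (ESSN, PNO)
);

INSERT INTO EMPLOYEE VALUES (123456789, 'Smith');
INSERT INTO EMPLOYEE VALUES (333445555, 'Wong');

-- Hypothetical assignments: Wong has two 10-hour projects, Smith has none.
INSERT INTO WORKS_ON VALUES (333445555, 2, 10.0);
INSERT INTO WORKS_ON VALUES (333445555, 3, 10.0);
INSERT INTO WORKS_ON VALUES (123456789, 1, 32.5);

-- Join: one result row per matching WORKS_ON row, so Wong appears twice.
SELECT LNAME FROM EMPLOYEE, WORKS_ON
WHERE SSN = ESSN AND HOURS = 10.0;

-- IN: each employee is tested once, so Wong appears once.
SELECT LNAME FROM EMPLOYEE
WHERE SSN IN (SELECT ESSN FROM WORKS_ON WHERE HOURS = 10.0);

-- NOT EXISTS: employees without any 10-hour assignment, here only Smith.
SELECT LNAME FROM EMPLOYEE
WHERE NOT EXISTS (SELECT * FROM WORKS_ON
                  WHERE SSN = ESSN AND HOURS = 10.0);
```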

SQL syntax. More complex
SELECT <attribute-list and function-list>
FROM <table-list>
[ WHERE <condition> ]
[ GROUP BY <grouping attribute-list> ]
[ HAVING <group condition> ]
[ ORDER BY <attribute-list> ];

Aggregate functions
• Built-in functions: AVG(), SUM(), MIN(), MAX(), COUNT()
  - appear in SELECT and HAVING clauses!
  - NULL: is eliminated

           50     50
           100    100
           Null   0
    AVG()  75     50

• List the number of employees
SELECT COUNT(*)
FROM EMPLOYEE;
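
The effect of NULL on an aggregate can be checked directly with the two columns of values from the slide. A minimal sketch, assuming a hypothetical table T with columns X and Y:

```sql
-- Hypothetical table holding the two value columns from the slide.
CREATE TABLE T (X INTEGER, Y INTEGER);
INSERT INTO T VALUES (50,   50);
INSERT INTO T VALUES (100,  100);
INSERT INTO T VALUES (NULL, 0);

-- AVG(X) = 75: the NULL is eliminated, so the average is (50 + 100) / 2.
-- AVG(Y) = 50: the 0 is a real value, so the average is (50 + 100 + 0) / 3.
-- COUNT(X) = 2 counts only non-NULL values, COUNT(*) = 3 counts rows.
SELECT AVG(X), AVG(Y), COUNT(X), COUNT(*)
FROM T;
```

How the averages are printed (75, 75.0, 75.0000) depends on the database system.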

Grouping
Used to apply an aggregate function to subgroups of tuples in a relation
• GROUP BY: grouping attributes
• HAVING: condition that a group has to satisfy
• Separate group for all tuples with NULL value of the grouping attribute
• List for each department the department number, the number of employees and the average salary.
SELECT DNO, COUNT(*), AVG(SALARY)
FROM EMPLOYEE
GROUP BY DNO
HAVING COUNT(*) > 2;

Groups before the HAVING condition is applied:

DNO  COUNT(*)  AVG(SALARY)
5    4         33250
4    3         31000
1    1         55000

HAVING COUNT(*) > 2 keeps only the groups for departments 5 and 4.

Order of query results
• Select department names and their locations in alphabetical order.
SELECT DNAME, DLOCATION
FROM DEPARTMENT D, DEPT_LOCATIONS DL
WHERE D.DNUMBER = DL.DNUMBER
ORDER BY DNAME ASC, DLOCATION DESC;

DNAME           DLOCATION
Administration  Stafford
Headquarters    Houston
Research        Sugarland
Research        Houston
Research        Bellaire

Null values
• List all employees that do not have a boss.
SELECT FNAME, LNAME
FROM EMPLOYEE
WHERE SUPERSSN IS NULL;
'SUPERSSN = NULL' and 'SUPERSSN <> NULL' will not return any matching tuples

Insert new data
INSERT INTO <table> (<attr>, …) VALUES (<val>, …);
INSERT INTO <table> (<attr>, …) <subquery>;
• Store information about how many hours an employee works for the project '1' into WORKS_ON
INSERT INTO WORKS_ON VALUES (123456789, 1, 32.5);
• Integrity constraint!
• Referential integrity constraint!
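
Both INSERT forms and the NULL comparison rule from the Null values slide can be combined in one short script. This is a sketch: EMPLOYEE is cut down to three columns, and TOP_LEVEL is a hypothetical table introduced only to show the subquery form.

```sql
CREATE TABLE EMPLOYEE (
    SSN      INTEGER PRIMARY KEY,
    LNAME    VARCHAR(30) NOT NULL,
    SUPERSSN INTEGER REFERENCES EMPLOYEE(SSN)   -- NULL means "no boss"
);

-- Form 1: explicit values. Borg is inserted first so that Wong's SUPERSSN has a target.
INSERT INTO EMPLOYEE (SSN, LNAME, SUPERSSN) VALUES (888665555, 'Borg', NULL);
INSERT INTO EMPLOYEE (SSN, LNAME, SUPERSSN) VALUES (333445555, 'Wong', 888665555);

-- Form 2: the rows come from a subquery (TOP_LEVEL is a hypothetical table).
CREATE TABLE TOP_LEVEL (SSN INTEGER PRIMARY KEY, LNAME VARCHAR(30));
INSERT INTO TOP_LEVEL (SSN, LNAME)
    SELECT SSN, LNAME FROM EMPLOYEE WHERE SUPERSSN IS NULL;

-- IS NULL finds Borg.
SELECT LNAME FROM EMPLOYEE WHERE SUPERSSN IS NULL;

-- Comparisons with NULL are never true: both queries return no rows.
SELECT LNAME FROM EMPLOYEE WHERE SUPERSSN = NULL;
SELECT LNAME FROM EMPLOYEE WHERE SUPERSSN <> NULL;
```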

Managing data. Modify stored data
UPDATE <table> SET <attr> = <val>, …
WHERE <condition>;
UPDATE <table> SET (<attr>, …) = (<subquery>)
WHERE <condition>;
• Give all employees in the 'Research' department a 10% raise in salary.
UPDATE EMPLOYEE
SET SALARY = SALARY*1.1
WHERE DNO IN (SELECT DNUMBER
              FROM DEPARTMENT
              WHERE DNAME = 'Research');
• Integrity constraint!
• Referential integrity constraint!

Managing data. Delete stored data
DELETE FROM <table> WHERE <condition>;
• Delete employees having the last name 'Borg' from the EMPLOYEE table
DELETE FROM EMPLOYEE
WHERE LNAME = 'Borg';
• Referential integrity constraints: MGRSSN in DEPARTMENT is a foreign key referring to SSN in EMPLOYEE, and Borg's SSN is the MGRSSN of Headquarters
• SET NULL? SET DEFAULT? CASCADE?

EMPLOYEE
FNAME   M  LNAME    SSN
Ramesh  K  Narayan  666884444
Joyce   A  English  453453453
Ahmad   V  Jabbar   987987987
James   E  Borg     888665555

DEPARTMENT
DNAME           DNUMBER  MGRSSN
Research        5        333445555
Administration  4        987654321
Headquarters    1        888665555
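
One of the three options raised above, SET NULL, can be sketched as follows. This assumes a database system that enforces foreign keys; some systems do so only when asked (SQLite, for instance, needs foreign key enforcement switched on first). The tables are cut down to the columns involved.

```sql
CREATE TABLE EMPLOYEE (SSN INTEGER PRIMARY KEY, LNAME VARCHAR(30));
CREATE TABLE DEPARTMENT (
    DNUMBER INTEGER PRIMARY KEY,
    DNAME   VARCHAR(30),
    MGRSSN  INTEGER REFERENCES EMPLOYEE(SSN) ON DELETE SET NULL
);

INSERT INTO EMPLOYEE VALUES (888665555, 'Borg');
INSERT INTO EMPLOYEE VALUES (987987987, 'Jabbar');
INSERT INTO DEPARTMENT VALUES (1, 'Headquarters', 888665555);

-- Deleting Borg would leave MGRSSN pointing at a missing employee.
-- With ON DELETE SET NULL the system sets MGRSSN to NULL instead.
DELETE FROM EMPLOYEE
WHERE LNAME = 'Borg';

-- With foreign keys enforced: Headquarters, NULL.
SELECT DNAME, MGRSSN FROM DEPARTMENT;
```

Declaring ON DELETE CASCADE instead would delete the Headquarters row together with Borg. Without any ON DELETE clause, a system that enforces the constraint rejects the DELETE.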

Views
• A virtual table derived from other (possibly virtual) tables.
  CREATE VIEW dept_view
  AS SELECT DNUMBER, DNAME
  FROM DEPARTMENT;
• Why?
  - Simplify query commands
  - Provide data security
  - Enhance programming productivity
• Update problems
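
That a view is virtual, i.e. stores no data of its own, can be seen by changing the base table after the view has been created. A minimal sketch: DEPARTMENT is cut down to three columns, and the 'Sales' department is a hypothetical row added for illustration.

```sql
CREATE TABLE DEPARTMENT (
    DNAME   VARCHAR(30),
    DNUMBER INTEGER PRIMARY KEY,
    MGRSSN  INTEGER
);
INSERT INTO DEPARTMENT VALUES ('Research',       5, 333445555);
INSERT INTO DEPARTMENT VALUES ('Administration', 4, 987654321);
INSERT INTO DEPARTMENT VALUES ('Headquarters',   1, 888665555);

-- The view hides MGRSSN (data security) and shortens later queries.
CREATE VIEW dept_view
AS SELECT DNUMBER, DNAME
FROM DEPARTMENT;

-- A view is queried like a table: returns Research.
SELECT DNAME FROM dept_view WHERE DNUMBER = 5;

-- Hypothetical new department, inserted into the base table only.
INSERT INTO DEPARTMENT VALUES ('Sales', 7, NULL);

-- The view reflects the change: 4 rows.
SELECT DNUMBER, DNAME FROM dept_view;
```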