Principles of Database Systems
Chapter 6: Relational Database Basics (II)
Textbook: Chapter 6, The Relational Algebra and Relational Calculus
Review: Relational Data Model

Structure: Relation (Table)
  A relation is defined as a set of tuples.
Operations
  Retrieval
  Update
    Insert
    Delete
    Update (Modify)
Constraints
  Domain Constraint
  Key Constraint
  Constraints on Null
  Entity Integrity Constraint
  Referential Integrity Constraint
Structure of the Course Textbook

Basic database concepts (Introduction)
  Chapter 1: Databases and Database Users
  Chapter 2: Database System Concepts and Architecture
Relational data model (one of the logical models: the Relational Model)
  Chapter 5: The Relational Data Model and Relational Database Constraints ("data structure" and "data constraints")
  Chapter 6: Relational Algebra and Relational Calculus ("data operations") (this lecture)
  Chapter 8: SQL-99 ("data operations")
  Chapter 9: More SQL ("data operations")
Structure of the Course Textbook (continued)

Database design
  Chapter 3: Data Modeling Using the Entity-Relationship Model (one of the conceptual models: the Entity-Relationship Model) (Conceptual Modeling)
  Chapter 7: Relational Database Design by ER- and EER-to-Relational Mapping
  Part 3: Database Design Theory and Methodology (the theoretical basis of "data model optimization")
  Chapter 16: Physical Database Design and Tuning (physical database design)
    Chapter 13: Disk Organization, Basic File Structures, and Hashing (physical data model)
    Chapter 14: Indexing Structures for Files (the main task of physical database design)
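Main Topics of This Chapter

Overview of relational algebra
Relational-specific operation: SELECT
Relational-specific operation: PROJECT
Traditional set operations: UNION, SET DIFFERENCE, INTERSECTION, CARTESIAN PRODUCT
Relational-specific operation: JOIN
Relational-specific operation: DIVISION
Basic operations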
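Main Topics of This Chapter

6.1 Unary relational operations: SELECT and PROJECT
6.2 Set-based relational algebra operations
6.3 Binary relational operations: JOIN and DIVISION
6.4 Other relational operations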
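Review: Components of a Data Model

Data structure: describes the static characteristics of the system
Data operations: describe the dynamic characteristics of the system
  Retrieval (query)
  Update (insert, delete, modify)
Data constraints: integrity constraint rules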
Relational Operations

Characteristic: set-at-a-time operation, i.e. both the operands and the results are sets
SQL: Structured Query Language
Categories: DML, DDL, DCL
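Overview of Relational Algebra

Relational algebra is an abstract query language. It is the traditional way of expressing relational data manipulation (and the foundation of SQL): a query is expressed as operations on relations.
An operation involves operands, an operator, and a result.
  Example: (3+6)*4=36 (an expression)
Operations of relational algebra
  Operands: relations
  Result: a relation
  Operators: four kinds
A sequence of relational algebra operations forms a relational algebra expression, whose result will also be a relation.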
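Operators of Relational Algebra

Set operators (set operations from mathematical set theory)
  A relation is treated as a set of tuples (each relation is defined to be a set of tuples)
  The operations work along the "horizontal" direction of a relation, i.e. on rows
Relational-specific operators (operations developed specifically for relational databases)
  They involve columns as well as rows
Arithmetic comparison operators
  They support the relational-specific operators
Logical operators
  They support the relational-specific operators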
Review: Properties of Sets

The elements of a set are distinct from one another
The elements of a set are unordered
6.1 Unary Relational Operations: SELECT and PROJECT
Relational-Specific Operation: The SELECT (Restrict) Operation

The SELECT operation is used to select a subset of the tuples from a relation based on a selection condition.
Notation
  σ<selection condition>(R)
  The symbol σ (sigma) denotes the SELECT operator, and the selection condition is a Boolean expression specified on the attributes of relation R.
Select the EMPLOYEE tuples whose department number is 4:
  σDNO=4(EMPLOYEE)
Tuples that make the condition true are selected; tuples that make the condition false are filtered out.

Select the EMPLOYEE tuples whose salary is greater than $30,000:
  σSALARY>30000(EMPLOYEE)
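The two selections above can be sketched in Python. This is only an illustration under stated assumptions: a relation is modelled as a set of tuples with a separate attribute list, the helper name `select` is hypothetical, and the sample rows are made up. Only the attribute names DNO and SALARY come from the slides.

```python
# A relation is modelled as a set of tuples plus an ordered attribute list.
# The rows below are made-up sample data, not the textbook's table.
EMPLOYEE_ATTRS = ("FNAME", "DNO", "SALARY")
EMPLOYEE = {
    ("John", 5, 30000),
    ("Jennifer", 4, 43000),
    ("Alicia", 4, 25000),
    ("James", 1, 55000),
}

def select(attrs, rows, predicate):
    """sigma_<predicate>(R): keep the tuples for which predicate is true."""
    return {row for row in rows if predicate(dict(zip(attrs, row)))}

# sigma_<DNO=4>(EMPLOYEE)
dept4 = select(EMPLOYEE_ATTRS, EMPLOYEE, lambda t: t["DNO"] == 4)
# sigma_<SALARY>30000>(EMPLOYEE)
high_paid = select(EMPLOYEE_ATTRS, EMPLOYEE, lambda t: t["SALARY"] > 30000)

print(sorted(dept4))
print(sorted(high_paid))
```

Each result keeps all three attributes of EMPLOYEE; only the number of tuples changes.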
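Relational-Specific Operation: The SELECT Operation

Notes
R may be a base relation (or a view), or the intermediate result relation of other relational algebra operations (the same holds for all later operations).
The relation resulting from the SELECT operation has the same attributes as R.
The number of tuples in the result of a SELECT is less than or equal to the number of tuples in the input relation R.
SELECT works from the row perspective: it picks rows and keeps every column.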
Relational-Specific Operation: The SELECT Operation

About the selection condition

The Boolean expression specified in <selection condition> is made up of a number of clauses of the form:
  <attribute name> <comparison op> <constant value>
  <attribute name> <comparison op> <attribute name> (typically used when several tables are involved)
<attribute name> is the name of an attribute of R, <comparison op> is normally one of the operators {=, <, ≤, >, ≥, ≠}, and <constant value> is a constant value from the attribute domain.
Clauses can be arbitrarily connected by the Boolean operators AND, OR, and NOT to form a general selection condition.
List the employees who work in department 4 and earn more than 25000, or who work in department 5 and earn more than 30000.
  σ(DNO=4 AND SALARY>25000) OR (DNO=5 AND SALARY>30000)(EMPLOYEE)
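A compound selection condition maps directly onto a Boolean function over a tuple. The sketch below assumes made-up sample rows and a hypothetical function name `condition`; only the condition itself is taken from the slide.

```python
# Made-up sample rows; each tuple is (FNAME, DNO, SALARY).
EMPLOYEE = {
    ("John", 5, 30000),
    ("Franklin", 5, 40000),
    ("Jennifer", 4, 43000),
    ("Alicia", 4, 25000),
}

def condition(row):
    """(DNO=4 AND SALARY>25000) OR (DNO=5 AND SALARY>30000)."""
    _, dno, salary = row
    return (dno == 4 and salary > 25000) or (dno == 5 and salary > 30000)

# sigma_<condition>(EMPLOYEE)
result = {row for row in EMPLOYEE if condition(row)}
print(sorted(result))
```

Note that the comparisons are strict, so an employee in department 5 earning exactly 30000 is filtered out.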
Relational-Specific Operation: The SELECT Operation

SELECT Operation Properties
SELECT is commutative:
  σ<condition1>(σ<condition2>(R)) = σ<condition2>(σ<condition1>(R))
Because of the commutativity property, a cascade (sequence) of SELECT operations may be applied in any order:
  σ<cond1>(σ<cond2>(σ<cond3>(R))) = σ<cond2>(σ<cond3>(σ<cond1>(R)))
A cascade of SELECT operations may be replaced by a single selection with a conjunction of all the conditions:
  σ<cond1>(σ<cond2>(σ<cond3>(R))) = σ<cond1> AND <cond2> AND <cond3>(R)
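These properties can be checked on a small example. The relation R and the three conditions below are hypothetical, chosen only so that each condition filters something; the check illustrates the properties on one instance and is not a proof.

```python
# R is an arbitrary sample relation of (A, B) pairs.
R = {(a, b) for a in range(6) for b in range(6)}

def select(rows, predicate):
    """sigma_<predicate>(rows)."""
    return {row for row in rows if predicate(row)}

def cond1(t):
    return t[0] > 1

def cond2(t):
    return t[1] < 4

def cond3(t):
    return (t[0] + t[1]) % 2 == 0

# Commutativity: the order of two selections does not matter.
commutes = select(select(R, cond2), cond1) == select(select(R, cond1), cond2)

# A cascade equals one selection on the conjunction of the conditions.
cascade = select(select(select(R, cond3), cond2), cond1)
single = select(R, lambda t: cond1(t) and cond2(t) and cond3(t))

print(commutes, cascade == single)
```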
Relational-Specific Operation: The PROJECT Operation

The PROJECT operation, denoted by π (pi), keeps certain columns (attributes) from a relation, discards the other columns, and produces a new relation.
Notation
  π<attribute list>(R)

Example: list each employee's first name, last name, and salary.
Relational-Specific Operation: The PROJECT Operation

Notes
R may be a base relation (or a view), or the intermediate result relation of other relational algebra operations.
The result of the PROJECT operation has only the attributes specified in <attribute list>, in the same order as they appear in the list.
If the attribute list includes only nonkey attributes of R, duplicate tuples are likely to occur. The PROJECT operation removes any duplicate tuples, so its result is a set of tuples and hence a valid relation. This is known as duplicate elimination.
PROJECT works mainly from the column perspective.
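Duplicate elimination comes for free when a relation is modelled as a Python set. In this sketch the helper name `project`, the attribute names, and the sample rows are assumptions for illustration.

```python
EMPLOYEE_ATTRS = ("FNAME", "LNAME", "SEX", "SALARY")
EMPLOYEE = {  # made-up sample rows
    ("John", "Smith", "M", 30000),
    ("Ramesh", "Narayan", "M", 38000),
    ("Joyce", "English", "F", 25000),
    ("Alicia", "Zelaya", "F", 25000),
}

def project(attrs, rows, attr_list):
    """pi_<attr_list>(R): keep the listed columns, in the listed order."""
    idx = [attrs.index(a) for a in attr_list]
    # Building a set removes duplicate tuples automatically.
    return {tuple(row[i] for i in idx) for row in rows}

# First name, last name, and salary of every employee.
names_salary = project(EMPLOYEE_ATTRS, EMPLOYEE, ["FNAME", "LNAME", "SALARY"])
# Only nonkey attributes: two employees collapse into one result tuple.
sex_salary = project(EMPLOYEE_ATTRS, EMPLOYEE, ["SEX", "SALARY"])

print(len(EMPLOYEE), len(names_salary), len(sex_salary))
```

Joyce and Alicia both project to ("F", 25000), so the second result has one tuple fewer than EMPLOYEE.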
Relational-Specific Operation: The PROJECT Operation

PROJECT Operation Properties
The number of tuples in the result of the projection π<list>(R) is always less than or equal to the number of tuples in R.
PROJECT is not commutative.
π<list1>(π<list2>(R)) = π<list1>(R) as long as <list2> contains the attributes in <list1>.
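A small check of these properties, under the same assumptions as before (set-of-tuples relations, a hypothetical `project` helper, made-up data). Swapping the two projections fails because the inner projection has already discarded an attribute the outer one needs.

```python
ATTRS = ("A", "B", "C")
R = {(1, "x", 10), (1, "y", 10), (2, "x", 20)}  # made-up sample relation

def project(attrs, rows, attr_list):
    """pi_<attr_list>(rows); raises ValueError for an unknown attribute."""
    idx = [attrs.index(a) for a in attr_list]
    return {tuple(row[i] for i in idx) for row in rows}

inner = project(ATTRS, R, ["A", "C"])          # pi_<A,C>(R)
nested = project(("A", "C"), inner, ["A"])     # pi_<A>(pi_<A,C>(R))
direct = project(ATTRS, R, ["A"])              # pi_<A>(R)

# Swapped order: pi_<A,C>(pi_<A>(R)) is undefined, since C is already gone.
try:
    project(("A",), direct, ["A", "C"])
    swapped_ok = True
except ValueError:
    swapped_ok = False

print(nested == direct, len(inner) <= len(R), swapped_ok)
```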
Sequences of Operations and the RENAME Operation

Example: retrieve the FIRST NAME, LAST NAME, and SALARY of the employees who work in department 5.
Two ways to answer the query:
  A single relational algebra expression
  Creating intermediate result relations
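Both ways of answering the department 5 query can be sketched in Python. The attribute names FNAME and LNAME, the intermediate relation name DEP5_EMPS, the renamed attribute list, and the sample rows are assumptions for illustration; the slides give only the query itself.

```python
EMPLOYEE_ATTRS = ("FNAME", "LNAME", "DNO", "SALARY")
EMPLOYEE = {  # made-up sample rows
    ("John", "Smith", 5, 30000),
    ("Franklin", "Wong", 5, 40000),
    ("Alicia", "Zelaya", 4, 25000),
}

def select(attrs, rows, predicate):
    return {row for row in rows if predicate(dict(zip(attrs, row)))}

def project(attrs, rows, attr_list):
    idx = [attrs.index(a) for a in attr_list]
    return {tuple(row[i] for i in idx) for row in rows}

wanted = ["FNAME", "LNAME", "SALARY"]

# Way 1: one nested expression,
#   pi_<FNAME, LNAME, SALARY>(sigma_<DNO=5>(EMPLOYEE))
single = project(
    EMPLOYEE_ATTRS,
    select(EMPLOYEE_ATTRS, EMPLOYEE, lambda t: t["DNO"] == 5),
    wanted,
)

# Way 2: name the intermediate result, then project from it.
DEP5_EMPS = select(EMPLOYEE_ATTRS, EMPLOYEE, lambda t: t["DNO"] == 5)
RESULT = project(EMPLOYEE_ATTRS, DEP5_EMPS, wanted)

# RENAME changes attribute names only; the tuples are untouched.
RESULT_ATTRS = ("FIRST_NAME", "LAST_NAME", "SALARY")

print(single == RESULT, sorted(RESULT))
```

The two forms are equivalent; naming intermediate relations mostly helps readability when an expression gets deeply nested.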
6.2 Set-Based Relational Algebra Operations
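Traditional Set Operation: UNION

Definition
  The set of tuples that belong to relation R or to relation S
  R∪S = { t | t∈R ∨ t∈S }
Notes
  Duplicate tuples are eliminated.
  To apply UNION, the two relations R and S must have the same relation schema (they must be union compatible): the two relations have the same number of attributes, and each pair of corresponding attributes has the same domain. The attribute names need not match; if they differ, the result uses the attribute names of R.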
Example

Retrieve the social security numbers of the employees who work in department 5 or who directly supervise an employee working in department 5.
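One possible way to build this query step by step is sketched below. The attribute names SSN and SUPERSSN, the intermediate relation names, and the sample rows are assumptions for illustration. The `union` helper checks only that both relations have the same number of attributes; it does not check attribute domains.

```python
# Each tuple is (SSN, SUPERSSN, DNO); the values are made-up sample data.
EMPLOYEE = {
    ("111", "333", 5),
    ("222", "333", 5),
    ("333", "888", 5),
    ("444", "888", 4),
    ("888", None, 1),
}

DEP5_EMPS = {row for row in EMPLOYEE if row[2] == 5}        # sigma_<DNO=5>
RESULT1 = {(ssn,) for ssn, _, _ in DEP5_EMPS}               # pi_<SSN>
RESULT2 = {(super_ssn,) for _, super_ssn, _ in DEP5_EMPS}   # pi_<SUPERSSN>

def union(r, s):
    """R UNION S, defined only for union-compatible relations."""
    if r and s and len(next(iter(r))) != len(next(iter(s))):
        raise ValueError("relations are not union compatible")
    return r | s  # a Python set drops duplicate tuples automatically

RESULT = union(RESULT1, RESULT2)
print(sorted(RESULT))
```

The tuple ("333",) appears in both operands but only once in the result, which is the duplicate elimination described in the notes on UNION.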
List the employees who work in department 4 and earn more than 25000, or who work in department 5 and earn more than 30000.
  σ(DNO=4 AND SALARY>25000) OR (DNO=5 AND SALARY>30000)(EMPLOYEE)
Traditional Set Operation: SET DIFFERENCE (also called MINUS or EXCEPT)

Definition
  The set of tuples that belong to relation R but not to relation S
  R-S = { t | t∈R ∧ t∉S }
Traditional Set Operation: INTERSECTION

Definition
  The set of tuples that belong to both relation R and relation S
  R∩S = { t | t∈R ∧ t∈S }
  R∩S = R-(R-S)
Some Properties of UNION, INTERSECT, and DIFFERENCE

Both union and intersection are commutative operations; that is,
  R ∪ S = S ∪ R, and R ∩ S = S ∩ R
Both union and intersection are associative operations; that is,
  R ∪ (S ∪ T) = (R ∪ S) ∪ T
  (R ∩ S) ∩ T = R ∩ (S ∩ T)
The minus operation is not commutative; that is, in general,
  R - S ≠ S - R
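Python's built-in set operators mirror these definitions, so the identities can be checked on sample data. The three relations below are made up and assumed to be union compatible; the check illustrates the properties on one instance only.

```python
# Three union-compatible sample relations (made-up data).
R = {(1, "a"), (2, "b"), (3, "c")}
S = {(2, "b"), (3, "c"), (4, "d")}
T = {(3, "c"), (5, "e")}

difference = R - S      # tuples in R but not in S
intersection = R & S    # tuples in both R and S

checks = {
    "intersection via difference": intersection == R - (R - S),
    "union commutative": R | S == S | R,
    "intersection commutative": R & S == S & R,
    "union associative": R | (S | T) == (R | S) | T,
    "intersection associative": (R & S) & T == R & (S & T),
    "minus commutative": R - S == S - R,
}
for name, holds in checks.items():
    print(f"{name}: {holds}")
```

Every check holds except the last one: here R - S and S - R contain different tuples.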
Traditional Set Operation: CARTESIAN PRODUCT (also called CROSS PRODUCT or CROSS JOIN)

The result of R(A1, A2, ..., An) × S(B1, B2, ..., Bm) is a relation Q with n + m attributes Q(A1, A2, ..., An, B1, B2, ..., Bm), in that order. The resulting relation Q has one tuple for each combination of tuples, one from R and one from S. Hence, if R has nR tuples and S has nS tuples, then R × S will have nR * nS tuples.
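The attribute and tuple counts can be seen in a short sketch. The helper name `cartesian_product` and the sample relations are assumptions; `itertools.product` from the Python standard library enumerates the pairs.

```python
from itertools import product

# Made-up sample relations: R has 2 attributes and 3 tuples,
# S has 1 attribute and 2 tuples.
R_ATTRS, R = ("A1", "A2"), {(1, "x"), (2, "y"), (3, "z")}
S_ATTRS, S = ("B1",), {("p",), ("q",)}

def cartesian_product(r_attrs, r, s_attrs, s):
    """R x S: attributes of R followed by those of S, one tuple per pair."""
    return r_attrs + s_attrs, {tr + ts for tr, ts in product(r, s)}

Q_ATTRS, Q = cartesian_product(R_ATTRS, R, S_ATTRS, S)
print(Q_ATTRS, len(Q))
```

Q has n + m = 3 attributes and nR * nS = 6 tuples.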
Example: retrieve the name of each female employee who has dependents, together with the names of her dependents.
6.3 Binary Relational Operations: JOIN and DIVISION
Relational-Specific Operation: The JOIN Operation

JOIN is mainly used to combine two related relations.
This operation is very important for any relational database with more than a single relation, because it allows us to process relationships among relations.
Notation
  The general form of a JOIN operation on two relations R(A1, A2, ..., An) and S(B1, B2, ..., Bm) is:
    R ⋈<join condition> S
Example: retrieve the name of each female employee who has dependents, together with the names of her dependents.

The main difference between CARTESIAN PRODUCT and JOIN: in JOIN, only combinations of tuples satisfying the join condition appear in the result, whereas in the CARTESIAN PRODUCT all combinations of tuples are included in the result.
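The difference shows up clearly on the dependents query. In this sketch the attribute names (SSN, ESSN, DEPENDENT_NAME), the intermediate relation names, and all rows are assumptions for illustration.

```python
from itertools import product

# EMPLOYEE tuples are (FNAME, SSN, SEX); DEPENDENT tuples are
# (ESSN, DEPENDENT_NAME). All rows are made-up sample data.
EMPLOYEE = {("Jennifer", "987", "F"), ("Alicia", "999", "F"), ("John", "123", "M")}
DEPENDENT = {("987", "Abner"), ("123", "Alice"), ("123", "Michael")}

FEMALE_EMPS = {e for e in EMPLOYEE if e[2] == "F"}  # sigma_<SEX='F'>

# Cartesian product: every female employee paired with every dependent.
EMP_DEPENDENTS = {e + d for e, d in product(FEMALE_EMPS, DEPENDENT)}
# A later selection keeps only the pairs that belong together (SSN = ESSN).
ACTUAL_DEPENDENTS = {t for t in EMP_DEPENDENTS if t[1] == t[3]}

# JOIN: the condition is applied while pairing, so unrelated
# combinations never reach the result.
JOINED = {e + d for e in FEMALE_EMPS for d in DEPENDENT if e[1] == d[0]}

RESULT = {(t[0], t[4]) for t in JOINED}  # pi_<FNAME, DEPENDENT_NAME>
print(len(EMP_DEPENDENTS), len(JOINED), RESULT)
```

The product holds six combinations, while the join holds only the one that satisfies the condition; selecting over the product gives the same relation as the join.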
<join condition>

A general join condition is of the form:
  <condition> AND <condition> AND ... AND <condition>
Each condition is of the form Ai θ Bj, where Ai is an attribute of R, Bj is an attribute of S, Ai and Bj have the same domain, and θ (theta) is one of the comparison operators {=, <, ≤, >, ≥, ≠}. A JOIN operation with such a general join condition is called a THETA JOIN.

Exercise: compute the theta join of relations R and S with the condition R.A < S.C AND R.B < S.D.
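A theta join with this condition can be sketched as follows. The contents of R and S are made up, with R assumed to have attributes (A, B) and S attributes (C, D), and the helper name `theta_join` is hypothetical.

```python
# Made-up sample relations: R has attributes (A, B), S has attributes (C, D).
R = {(1, 2), (3, 8), (5, 1)}
S = {(2, 3), (4, 9), (6, 0)}

def theta_join(r, s, condition):
    """Pair every tuple of R with every tuple of S and keep the pairs
    for which the join condition holds."""
    return {tr + ts for tr in r for ts in s if condition(tr, ts)}

# Join condition: R.A < S.C AND R.B < S.D
result = theta_join(R, S, lambda tr, ts: tr[0] < ts[0] and tr[1] < ts[1])
print(sorted(result))
```

Each result tuple has the four attributes (A, B, C, D), and both comparisons must hold for a pair to be kept.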
EQUIJOIN

A join in which θ is "=".
Example: retrieve the name of the manager of each department.
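The manager query is a typical equijoin: each department is matched with the employee whose SSN equals the department's manager SSN. The attribute names (DNAME, DNUMBER, MGRSSN, SSN), the relation name DEPT_MGR, and the rows below are assumptions for illustration.

```python
# DEPARTMENT tuples are (DNAME, DNUMBER, MGRSSN); EMPLOYEE tuples are
# (FNAME, LNAME, SSN). All rows are made-up sample data.
DEPARTMENT = {
    ("Research", 5, "333"),
    ("Administration", 4, "987"),
    ("Headquarters", 1, "888"),
}
EMPLOYEE = {
    ("Franklin", "Wong", "333"),
    ("Jennifer", "Wallace", "987"),
    ("James", "Borg", "888"),
    ("John", "Smith", "123"),
}

# EQUIJOIN on MGRSSN = SSN: the only comparison operator used is "=".
DEPT_MGR = {d + e for d in DEPARTMENT for e in EMPLOYEE if d[2] == e[2]}

# Joined tuples are (DNAME, DNUMBER, MGRSSN, FNAME, LNAME, SSN); MGRSSN and
# SSN both survive and always hold the same value.
RESULT = {(t[0], t[3], t[4]) for t in DEPT_MGR}  # pi_<DNAME, FNAME, LNAME>
print(sorted(RESULT))
```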
NATURAL JOIN
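A natural join is commonly defined as an equijoin over the attributes that the two relations share by name, with only one copy of each shared attribute kept in the result. The sketch below follows that definition; the helper name `natural_join`, the relation and attribute names, and the rows are assumptions for illustration.

```python
def natural_join(r_attrs, r, s_attrs, s):
    """Equijoin on all attributes the two relations share by name,
    keeping a single copy of each shared attribute."""
    common = [a for a in r_attrs if a in s_attrs]
    s_extra = [a for a in s_attrs if a not in common]
    out_attrs = tuple(r_attrs) + tuple(s_extra)
    rows = set()
    for tr in r:
        dr = dict(zip(r_attrs, tr))
        for ts in s:
            ds = dict(zip(s_attrs, ts))
            if all(dr[a] == ds[a] for a in common):
                rows.add(tr + tuple(ds[a] for a in s_extra))
    return out_attrs, rows

# Made-up sample relations sharing the attribute DNUM.
PROJECT_ATTRS = ("PNAME", "DNUM")
PROJECT = {("ProductX", 5), ("Computerization", 4)}
DEPT_ATTRS = ("DNUM", "DNAME")
DEPT = {(5, "Research"), (4, "Administration"), (1, "Headquarters")}

attrs, rows = natural_join(PROJECT_ATTRS, PROJECT, DEPT_ATTRS, DEPT)
print(attrs)
print(sorted(rows))
```

Unlike the equijoin, the result carries DNUM only once. Department 1 has no matching project, so it does not appear.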
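Relational-Specific Operation: The DIVISION Operation

Example: retrieve the names of the employees who work on all the projects that "John Smith" works on.
First, retrieve the numbers of all the projects that "John Smith" works on, and name this intermediate result relation SMITH_PNOS.
Next, list, for every employee, the projects that the employee works on.
Finally, apply the DIVISION operation to the two relations above to obtain the result.
For R ÷ S, the result contains only the attributes of R that do not appear in S.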
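The three steps can be sketched as follows. The relation name WORKS_ON, its attributes (ESSN, PNO), the SSN "123" standing in for John Smith, and the helper name `divide` are all assumptions for illustration; the helper handles only a binary dividend R(X, Y) and a unary divisor S(Y).

```python
# WORKS_ON tuples are (ESSN, PNO); the rows are made-up sample data.
WORKS_ON = {
    ("123", 1), ("123", 2),
    ("453", 1), ("453", 2), ("453", 3),
    ("666", 1),
}

# Step 1: the project numbers of the reference employee
# (SSN "123" stands in for John Smith).
SMITH_PNOS = {(pno,) for essn, pno in WORKS_ON if essn == "123"}

def divide(r, s):
    """R(X, Y) / S(Y): the X values that are paired with every Y in S."""
    xs = {x for x, _ in r}
    return {(x,) for x in xs if all((x, y) in r for (y,) in s)}

# Steps 2 and 3: WORKS_ON already lists (employee, project) pairs,
# so it can be divided by SMITH_PNOS directly.
SSNS = divide(WORKS_ON, SMITH_PNOS)
print(sorted(SSNS))
```

The result keeps only the ESSN attribute, the one attribute of the dividend that is absent from the divisor. Getting the employees' names would take a further join with EMPLOYEE.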
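A Complete Set of Relational Algebra Operations

The set of relational algebra operations {∪, -, ×, σ, π} is a complete set.
Among the relational algebra operations, union, set difference, Cartesian product, selection, and projection are the five basic operations. The other three (intersection, join, and division) can be expressed in terms of these five. Introducing them does not increase the expressive power of the language, but it does simplify expressions.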
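As an illustration of this claim, the sketch below rebuilds division, intersection, and join from the basic operations only. The division formula used here (T1 = πX(R), T2 = πX((T1 × S) - R), R ÷ S = T1 - T2) is a standard identity rather than something stated on the slides, and all relation names and data are made up.

```python
from itertools import product

R = {("a", 1), ("a", 2), ("b", 1), ("c", 1), ("c", 2)}  # R(X, Y)
S = {(1,), (2,)}                                        # S(Y)

# Division from projection, Cartesian product, and difference:
#   T1 = pi_X(R);  T2 = pi_X((T1 x S) - R);  R / S = T1 - T2
T1 = {(x,) for x, _ in R}
missing = {tx + ty for tx, ty in product(T1, S)} - R
T2 = {(x,) for x, _ in missing}
quotient = T1 - T2

# Intersection from difference alone: R1 INTERSECT R2 = R1 - (R1 - R2)
R1, R2 = {(1,), (2,), (3,)}, {(2,), (3,), (4,)}
intersection = R1 - (R1 - R2)

# Join as a selection over a Cartesian product (condition: R.Y = S.Y).
RxS = {tr + ts for tr, ts in product(R, S)}
join = {t for t in RxS if t[1] == t[2]}

print(sorted(quotient), sorted(intersection), len(join))
```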
Chapter 6 Homework

Review textbook Sections 6.1, 6.2, and 6.3
  Think about: 6.2
Preview textbook Chapter 7
  Think about: What is the difference between relational algebra's SELECT and SQL's SELECT?
Written assignment
  6.16 (see 6.5 for reference)