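Principles of Database System
Chapter 6: Relational Database Basics (II)
Textbook: Chapter 6, The Relational Algebra and Relational Calculus

Review: Relational Data Model
- Structure: Relation (Table). A relation is defined as a set of tuples.
- Operations
  - Retrieval
  - Update: Insert, Delete, Update (Modify)
- Constraints
  - Domain Constraint
  - Key Constraint
  - Constraints on Null
  - Entity Integrity Constraint
  - Referential Integrity Constraint

Structure of the Course Textbook
- Basic database concepts (Introduction)
  - Chapter 1: Databases and Database Users
  - Chapter 2: Database System Concepts and Architecture
- Relational data model (one of the logical models: Relational Model)
  - Chapter 5: The Relational Data Model and Relational Database Constraints ("data structure" and "data constraints")
  - Chapter 6: Relational Algebra and Relational Calculus ("data operations") (this lecture)
  - Chapter 8: SQL-99 ("data operations")
  - Chapter 9: More SQL ("data operations")

Structure of the Course Textbook (continued)
- Database design
  - Chapter 3: Data Modeling Using the Entity-Relationship Model (one of the conceptual models; Conceptual Modeling)
  - Chapter 7: Relational Database Design by ER- and EER-to-Relational Mapping
  - Part 3: Database Design Theory and Methodology (the theoretical basis of "data model optimization")
  - Chapter 16: Physical Database Design and Tuning (physical database design)
  - Chapter 13: Disk Organization, Basic File Structures, and Hashing (physical data model)
  - Chapter 14: Indexing Structures for Files (the main task of physical database design)

Main Topics of This Chapter
- Overview of relational algebra
- Specialized relational operation: SELECT
- Specialized relational operation: PROJECT
- Traditional set operations: UNION, SET DIFFERENCE, INTERSECTION, CARTESIAN PRODUCT
- Specialized relational operation: JOIN
- Specialized relational operation: DIVISION
- Basic operations

Chapter Outline
- 6.1 Unary relational operations: SELECT and PROJECT
- 6.2 Set-based relational algebra operations
- 6.3 Binary relational operations: JOIN and DIVISION
- 6.4 Other relational operations

Review: Components of a Data Model
- Data structure: describes the static characteristics of the system
- Data operations: describe the dynamic characteristics of the system
  - Retrieval (query)
  - Update (insert, delete, modify)
- Data constraints: integrity constraint rules

Relational Operations
- Characteristic: set-at-a-time operation, that is, both the operands and the results are sets
- SQL: Structured Query Language
- Categories: DML, DDL, DCL

Overview of Relational Algebra
- Relational algebra is an abstract query language and a traditional way of expressing relational data manipulation (it is the foundation of SQL). It expresses queries as operations on relations.
- An operation involves operands, operators, and a result, as in the arithmetic expression (3+6)*4=36.
- Operations of relational algebra
  - Operands: relations
  - Results: relations
  - Operators: four kinds
- A sequence of relational algebra operations forms a relational algebra expression, whose result will also be a relation.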
Operators of Relational Algebra
- Set operators (set operations from mathematical set theory)
  - Treat a relation as a set of tuples (each relation is defined to be a set of tuples)
  - Operate along the "horizontal" direction of a relation, that is, on rows
- Specialized relational operators (operations developed specifically for relational databases)
  - Involve columns as well as rows
- Arithmetic comparison operators
  - Support the specialized relational operators
- Logical operators
  - Support the specialized relational operators

Review: Properties of Sets
- The elements of a set are distinct from one another
- The elements of a set are unordered

Specialized Relational Operation: The SELECT (Restrict) Operation
- The SELECT operation is used to select a subset of the tuples from a relation based on a selection condition.
- Notation: σ_{<selection condition>}(R)
- The symbol σ (sigma) denotes the SELECT operator, and the selection condition is a Boolean expression specified on the attributes of relation R.

Example: select the EMPLOYEE tuples whose department number is 4
  σ_{DNO=4}(EMPLOYEE)
Tuples that make the condition true are selected; tuples that make the condition false are filtered out.

Example: select the EMPLOYEE tuples whose salary is greater than $30,000
  σ_{SALARY>30000}(EMPLOYEE)

The SELECT Operation: Notes
- R may be a base relation (or a view), or an intermediate result relation produced by other relational algebra operations (the same holds for all the operations that follow).
- The relation resulting from the SELECT operation has the same attributes as R.
- The number of tuples in the result of a SELECT is less than or equal to the number of tuples in the input relation R.
- Selection is an operation on rows.

The SELECT Operation: The Selection Condition
- The Boolean expression specified in <selection condition> is made up of a number of clauses of the form:
  - <attribute name> <comparison op> <constant value>
  - <attribute name> <comparison op> <attribute name> (typically used when several tables are involved)
- <attribute name> is the name of an attribute of R, <comparison op> is normally one of the operators {=, <, ≤, >, ≥, ≠}, and <constant value> is a constant value from the attribute domain.
- Clauses can be arbitrarily connected by the Boolean operators AND, OR, and NOT to form a general selection condition.
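
The SELECT operation described above can be sketched in Python as a simple filter over tuples. This is only an illustration, not part of the slides: the sample tuples, the NAME attribute, and the helper name `select` are assumptions; DNO and SALARY follow the examples above.

```python
# Hypothetical sample data: a tiny EMPLOYEE relation, one dict per tuple.
# The NAME attribute and all values are made up for illustration.
EMPLOYEE = [
    {"NAME": "A", "DNO": 4, "SALARY": 43000},
    {"NAME": "B", "DNO": 5, "SALARY": 30000},
    {"NAME": "C", "DNO": 4, "SALARY": 25000},
    {"NAME": "D", "DNO": 5, "SALARY": 38000},
]


def select(relation, condition):
    """sigma_<condition>(relation): keep the tuples that make condition true."""
    return [t for t in relation if condition(t)]


# sigma_{DNO=4}(EMPLOYEE)
dept4 = select(EMPLOYEE, lambda t: t["DNO"] == 4)

# sigma_{SALARY>30000}(EMPLOYEE)
high_paid = select(EMPLOYEE, lambda t: t["SALARY"] > 30000)

# Clauses combined with NOT: sigma_{NOT (DNO=4)}(EMPLOYEE)
not_dept4 = select(EMPLOYEE, lambda t: not (t["DNO"] == 4))

print([t["NAME"] for t in dept4])
print([t["NAME"] for t in high_paid])
print([t["NAME"] for t in not_dept4])
```

Each result keeps every attribute of the input tuples and never contains more tuples than EMPLOYEE, which matches the notes on the SELECT operation.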
Example: list the employees who either work in department 4 and earn more than 25000, or work in department 5 and earn more than 30000
  σ_{(DNO=4 AND SALARY>25000) OR (DNO=5 AND SALARY>30000)}(EMPLOYEE)

The SELECT Operation: Properties
- SELECT is commutative:
  σ_{<condition1>}(σ_{<condition2>}(R)) = σ_{<condition2>}(σ_{<condition1>}(R))
- Because of the commutativity property, a cascade (sequence) of SELECT operations may be applied in any order:
  σ_{<cond1>}(σ_{<cond2>}(σ_{<cond3>}(R))) = σ_{<cond2>}(σ_{<cond3>}(σ_{<cond1>}(R)))
- A cascade of SELECT operations may be replaced by a single selection with a conjunction of all the conditions:
  σ_{<cond1>}(σ_{<cond2>}(σ_{<cond3>}(R))) = σ_{<cond1> AND <cond2> AND <cond3>}(R)

Specialized Relational Operation: The PROJECT Operation
- The PROJECT operation, denoted by π (pi), keeps certain columns (attributes) of a relation, discards the other columns, and produces a new relation.
- Notation: π_{<attribute list>}(R)

Example: list each employee's first name, last name, and salary

The PROJECT Operation: Notes
- R may be a base relation (or a view), or an intermediate result relation produced by other relational algebra operations.
- The result of the PROJECT operation has only the attributes specified in <attribute list>, in the same order as they appear in the list.
- If the attribute list includes only nonkey attributes of R, duplicate tuples are likely to occur. The PROJECT operation removes any duplicate tuples, so its result is a set of tuples and hence a valid relation. This is known as duplicate elimination.
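
The cascade property of SELECT and the duplicate elimination performed by PROJECT can be checked on a small example. The following is a minimal Python sketch under stated assumptions: the sample tuples, the SSN attribute, and the helper names `select` and `project` are hypothetical and chosen only for illustration.

```python
# Hypothetical sample data; attribute values are made up for illustration.
EMPLOYEE = [
    {"SSN": "1", "DNO": 5, "SALARY": 30000},
    {"SSN": "2", "DNO": 5, "SALARY": 40000},
    {"SSN": "3", "DNO": 4, "SALARY": 25000},
    {"SSN": "4", "DNO": 4, "SALARY": 43000},
    {"SSN": "5", "DNO": 5, "SALARY": 30000},
]


def select(relation, condition):
    """sigma_<condition>(relation): keep the tuples that make condition true."""
    return [t for t in relation if condition(t)]


def project(relation, attrs):
    """pi_<attrs>(relation): keep only attrs, in list order, without duplicates."""
    seen = set()
    result = []
    for t in relation:
        key = tuple(t[a] for a in attrs)
        if key not in seen:          # duplicate elimination
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result


def in_dept5(t):
    return t["DNO"] == 5


def earns_30k(t):
    return t["SALARY"] >= 30000


# A cascade of SELECTs in either order equals one SELECT with a conjunction.
cascade_a = select(select(EMPLOYEE, earns_30k), in_dept5)
cascade_b = select(select(EMPLOYEE, in_dept5), earns_30k)
single = select(EMPLOYEE, lambda t: in_dept5(t) and earns_30k(t))

# Projecting on nonkey attributes produces duplicates, which are removed.
dno_salary = project(EMPLOYEE, ["DNO", "SALARY"])
dno_only = project(EMPLOYEE, ["DNO"])

print(cascade_a == cascade_b == single)
print(dno_salary)
print(dno_only)
```

Employees 1 and 5 agree on DNO and SALARY, so the projection on those two attributes returns four tuples from five input tuples.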
Projection is mainly an operation on columns.

The PROJECT Operation: Properties
- The number of tuples in the result of the projection π_{<list>}(R) is always less than or equal to the number of tuples in R.
- PROJECT is not commutative.
- π_{<list1>}(π_{<list2>}(R)) = π_{<list1>}(R) as long as <list2> contains the attributes in <list1>.

Sequences of Operations and the RENAME Operation
- Example: retrieve the first name, last name, and salary of the employees who work in department 5
- Two ways to solve it:
  - A single relational algebra expression
  - Creating intermediate result relations

Traditional Set Operation: UNION
- Definition: the set of tuples that belong to relation R or to relation S
  R ∪ S = { t | t ∈ R ∨ t ∈ S }
- Notes
  - Duplicate tuples are eliminated.
  - To take the union of two relations R and S, they must have the same relation schema (they must be union compatible).
  - The two relations have the same number of attributes, and each pair of corresponding attributes has the same domain (the attribute names need not match; if they differ, the attribute names of R are used).

Example: retrieve the Social Security numbers of all employees who either work in department 5 or directly supervise an employee who works in department 5

Recall the earlier query: list the employees who either work in department 4 and earn more than 25000, or work in department 5 and earn more than 30000
  σ_{(DNO=4 AND SALARY>25000) OR (DNO=5 AND SALARY>30000)}(EMPLOYEE)

Traditional Set Operation: SET DIFFERENCE (also called MINUS or EXCEPT)
- Definition: the set of tuples that belong to relation R but not to relation S
  R − S = { t | t ∈ R ∧ t ∉ S }

Traditional Set Operation: INTERSECTION
- Definition: the set of tuples that belong to both relation R and relation S
  R ∩ S = { t | t ∈ R ∧ t ∈ S }
  R ∩ S = R − (R − S)

Some Properties of UNION, INTERSECT, and DIFFERENCE
- Both union and intersection are commutative operations; that is,
  R ∪ S = S ∪ R, and R ∩ S = S ∩ R
- Both union and intersection are associative operations; that is,
  R ∪ (S ∪ T) = (R ∪ S) ∪ T
  (R ∩ S) ∩ T = R ∩ (S ∩ T)
- The minus operation is not commutative; that is, in general
  R − S ≠ S − R

Traditional Set Operation: CARTESIAN PRODUCT (also called CROSS PRODUCT or CROSS JOIN)
- The result of R(A1, A2, ..., An) × S(B1, B2, ..., Bm) is a relation Q with n + m attributes Q(A1, A2, ..., An, B1, B2, ..., Bm), in that order.
- The resulting relation Q has one tuple for each combination of tuples, one from R and one from S.
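
The set operations and the Cartesian product above map directly onto Python sets of tuples. The sketch below is an illustration under stated assumptions: the relations R, S, and T and their contents are hypothetical, and each relation is modeled as a set of positional tuples with attribute names left implicit.

```python
from itertools import product

# Hypothetical union-compatible relations: same number of attributes and
# the same domains in corresponding positions.
R = {(1, "a"), (2, "b"), (3, "c")}
S = {(2, "b"), (4, "d")}
T = {(3, "c"), (4, "d")}

union = R | S          # R ∪ S, duplicates are eliminated automatically
difference = R - S     # R − S
intersection = R & S   # R ∩ S

# Intersection expressed with difference only: R ∩ S = R − (R − S)
intersection_via_minus = R - (R - S)

# Cartesian product: one tuple per combination, attributes of R then of S.
cartesian = {r + s for r, s in product(R, S)}

print(sorted(union))
print(sorted(difference))
print(sorted(intersection))
print(len(cartesian))
```

Because Python sets are unordered and hold distinct elements, they mirror the two set properties reviewed earlier in the deck.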
If R has nR tuples and S has nS tuples, then R × S has nR * nS tuples.

Example: retrieve the names of the female employees who have dependents, together with the names of their dependents

Specialized Relational Operation: The JOIN Operation
- JOIN is mainly used to combine two related relations.
- This operation is very important for any relational database with more than a single relation, because it allows us to process relationships among relations.
- Notation: the general form of a JOIN operation on two relations R(A1, A2, ..., An) and S(B1, B2, ..., Bm) is:
  R ⋈_{<join condition>} S

Example: retrieve the names of the female employees who have dependents, together with the names of their dependents

The main difference between CARTESIAN PRODUCT and JOIN: in JOIN, only combinations of tuples satisfying the join condition appear in the result, whereas in the CARTESIAN PRODUCT all combinations of tuples are included in the result.

<join condition>
- A general join condition is of the form:
  <condition> AND <condition> AND ... AND <condition>
- Each condition is of the form Ai θ Bj, where Ai is an attribute of R, Bj is an attribute of S, Ai and Bj have the same domain, and θ (theta) is one of the comparison operators {=, <, ≤, >, ≥, ≠}.
- A JOIN operation with such a general join condition is called a THETA JOIN.

Exercise: given relations R and S, compute the theta-join of R and S with the condition R.A < S.C AND R.B < S.D.

EQUIJOIN
- A join operation in which θ is "=".
- Example: to retrieve the name of the manager of each department

NATURAL JOIN
- An equijoin on the attributes that have the same name in both relations, with the duplicate join attributes removed from the result.

Specialized Relational Operation: The DIVISION Operation
- Example: retrieve the names of the employees who work on all the projects that "John Smith" works on
- First, retrieve the numbers of all the projects that "John Smith" works on, and name this intermediate result relation SMITH_PNOS.
- Next, retrieve the list of projects that each employee works on.
- Finally, apply the DIVISION operation to the two relations to obtain the result.
- The result of R ÷ S contains only the attributes of R that are not in S.

A Complete Set of Relational Algebra Operations
- The set of relational algebra operations {∪, −, ×, σ, π} is a complete set.
- Among the relational algebra operations, union, difference, Cartesian product, selection, and projection are the five basic operations. The other three (intersection, join, and division) can be expressed with the five basic operations. Introducing them does not add to the power of the language, but it simplifies expressions.

Chapter 6 Homework
- Review textbook Sections 6.1, 6.2, and 6.3
- Think about: 6.2
- Preview textbook Chapter 7
- Think about: What is the difference between Relational Algebra's SELECT and SQL's SELECT?
- Written assignment: 6.16 (see 6.5)
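
As a supplement to the JOIN, DIVISION, and completeness slides above, the following Python sketch works through the "John Smith" example on made-up data. It is an illustration only: the attribute names ESSN, PNO, SSN, and LNAME, the relation name SSN_PNOS, all sample values, and the helper names are assumptions. Only SMITH_PNOS is named in the slides.

```python
def theta_join(r, s, cond):
    """R joined with S on cond: merge each pair of tuples that satisfies cond.
    Assumes the two relations use different attribute names."""
    return [{**t, **u} for t in r for u in s if cond(t, u)]


def divide(r, s, keep, over):
    """R ÷ S: the `keep` values of R that appear with every `over` value of S."""
    required = {tuple(u[a] for a in over) for u in s}
    found = {}
    for t in r:
        key = tuple(t[a] for a in keep)
        found.setdefault(key, set()).add(tuple(t[a] for a in over))
    return [dict(zip(keep, k)) for k, vals in found.items() if required <= vals]


def divide_basic(r, s, keep, over):
    """The same quotient using only projection, Cartesian product, difference."""
    pairs = {(tuple(t[a] for a in keep), tuple(t[a] for a in over)) for t in r}
    t1 = {k for k, _ in pairs}                                # T1 = pi_keep(R)
    s_vals = {tuple(u[a] for a in over) for u in s}
    missing = {(k, v) for k in t1 for v in s_vals} - pairs    # (T1 x S) - R
    t2 = {k for k, _ in missing}                              # T2 = pi_keep(...)
    return t1 - t2                                            # T = T1 - T2


# Hypothetical data: which employee (ESSN) works on which project (PNO).
SSN_PNOS = [
    {"ESSN": "111", "PNO": 1}, {"ESSN": "111", "PNO": 2},
    {"ESSN": "222", "PNO": 1},
    {"ESSN": "333", "PNO": 1}, {"ESSN": "333", "PNO": 2}, {"ESSN": "333", "PNO": 3},
]
SMITH_PNOS = [{"PNO": 1}, {"PNO": 2}]   # projects John Smith works on
EMPLOYEE = [
    {"SSN": "111", "LNAME": "Smith"},
    {"SSN": "222", "LNAME": "Wong"},
    {"SSN": "333", "LNAME": "Zelaya"},
]

ssns = divide(SSN_PNOS, SMITH_PNOS, keep=["ESSN"], over=["PNO"])
names = [t["LNAME"] for t in
         theta_join(ssns, EMPLOYEE, lambda t, u: t["ESSN"] == u["SSN"])]

print(ssns)
print(divide_basic(SSN_PNOS, SMITH_PNOS, keep=["ESSN"], over=["PNO"]))
print(names)
```

The quotient keeps only ESSN, the attribute of the dividend that is absent from SMITH_PNOS. `divide_basic` reaches the same tuples without a dedicated division operator, in line with the claim that division adds no expressive power.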