Introduction
Database System Concepts, 6th Ed.
©Silberschatz, Korth and Sudarshan
See www.db-book.com for conditions on re-use
Database Management System (DBMS)
- DBMS contains information about a particular enterprise
  - Collection of interrelated data
  - Set of programs to access the data
  - An environment that is both convenient and efficient to use
- Database Applications:
  - Banking: transactions
  - Airlines: reservations, schedules
  - Universities: registration, grades
  - Sales: customers, products, purchases
  - Online retailers: order tracking, customized recommendations
  - Manufacturing: production, inventory, orders, supply chain
  - Human resources: employee records, salaries, tax deductions
History of Database Systems (1)
- 1950s and early 1960s:
  - Data processing using magnetic tapes for storage
    - Tapes provided only sequential access
  - Punched cards for input
- In the early days, database applications were built directly on top of file systems
Drawbacks of using file systems to store data
- In the early days, database applications were built directly on top of file systems
  - Data redundancy and inconsistency
    - Multiple file formats, duplication of information in different files
  - Difficulty in accessing data
    - Need to write a new program to carry out each new task
  - Data isolation: multiple files and formats
  - Integrity problems
    - Integrity constraints (e.g., account balance > 0) become “buried” in program code rather than being stated explicitly
    - Hard to add new constraints or change existing ones
Drawbacks of using file systems to store data (Cont.)
- Atomicity of updates
  - Failures may leave database in an inconsistent state with partial updates carried out
  - Example: Transfer of funds from one account to another should either complete or not happen at all
- Concurrent access by multiple users
  - Concurrent access needed for performance
  - Uncontrolled concurrent accesses can lead to inconsistencies
    - Example: Two people reading a balance (say 100) and updating it by withdrawing money (say 50 each) at the same time
- Security problems
  - Hard to provide user access to some, but not all, data
Database systems offer solutions to all the above problems
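The atomicity point can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in DBMS; the account table, its sample balances, the transfer helper, and the balance >= 0 check are assumptions made for illustration, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the CHECK states the integrity constraint explicitly.
conn.execute(
    "CREATE TABLE account (id TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 20)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Credit dst, then debit src; either both updates persist or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, "A", "B", 50)       # succeeds: A=50, B=70
failed = transfer(conn, "B", "A", 500)  # debit violates the CHECK
balances = dict(conn.execute("SELECT id, balance FROM account"))
print(ok, failed, balances)
```

In the failed transfer the credit to A has already been executed when the debit is rejected; the rollback undoes it, so no partial update remains.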
Levels of Abstraction in a DBMS
- Physical level: describes how a record (e.g., customer) is stored.
- Logical level: describes data stored in database, and the relationships among the data.
      type instructor = record
          ID : string;
          name : string;
          dept_name : string;
          salary : integer;
      end;
- View level: application programs hide details of data types. Views can also hide information (such as an employee’s salary) for security purposes.
History of Database Systems (2)
- Late 1960s and 1970s:
  - Hard disks allowed direct access to data
  - Network and hierarchical data models in widespread use
  - Ted Codd defines the relational data model
    - Would win the ACM Turing Award for this work
    - IBM Research begins System R prototype
    - UC Berkeley begins Ingres prototype
  - High-performance (for the era) transaction processing
Data Models
- A collection of tools for describing
  - Data
  - Data relationships
  - Data semantics
  - Data constraints
- Relational model
- Entity-Relationship data model (mainly for database design)
- Object-based data models (Object-oriented and Object-relational)
- Semistructured data model (XML)
- Other older models:
  - Network model
  - Hierarchical model
Relational Model
- Example of tabular data in the relational model
  [figure omitted: a table with its columns and rows labeled]
A Sample Relational Database
Data Definition Language (DDL)
- Data Manipulation Language (DML): language for accessing and manipulating the data organized by the appropriate data model
- Data Definition Language (DDL): specification notation for defining the database schema
  Example:
      create table instructor (
          ID         char(5),
          name       varchar(20),
          dept_name  varchar(20),
          salary     numeric(8,2))
- DDL compiler generates a set of table templates stored in a data dictionary
- Data dictionary contains metadata (i.e., data about data)
  - Database schema
  - Integrity constraints
    - Primary key (ID uniquely identifies instructors)
    - Referential integrity (references constraint in SQL)
      - e.g. dept_name value in any instructor tuple must appear in department relation
  - Authorization
SQL
- SQL: widely used non-procedural language
  - Example: Find the name of the instructor with ID 22222
        select name
        from instructor
        where instructor.ID = '22222'
  - Example: Find the ID and building of instructors in the Physics dept.
        select instructor.ID, department.building
        from instructor, department
        where instructor.dept_name = department.dept_name and
              department.dept_name = 'Physics'
- Application programs generally access databases through one of
  - Language extensions to allow embedded SQL
  - Application program interface (e.g., ODBC/JDBC) which allows SQL queries to be sent to a database
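As a rough analogue of the API route, the sketch below sends the two example queries to a database from Python through the standard sqlite3 module (used here instead of ODBC/JDBC). The sample rows and the column list of department are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Illustrative tables and rows; only the column names come from the slides.
conn.executescript("""
    create table department (dept_name varchar(20), building varchar(15));
    create table instructor (ID char(5), name varchar(20),
                             dept_name varchar(20), salary numeric(8,2));
    insert into department values ('Physics', 'Watson');
    insert into department values ('Music', 'Packard');
    insert into instructor values ('22222', 'Einstein', 'Physics', 95000);
    insert into instructor values ('15151', 'Mozart', 'Music', 40000);
""")

# Query 1: name of the instructor with a given ID (value passed as a parameter).
names = conn.execute(
    "select name from instructor where instructor.ID = ?", ("22222",)
).fetchall()

# Query 2: ID and building of instructors in a given department.
rows = conn.execute(
    """select instructor.ID, department.building
       from instructor, department
       where instructor.dept_name = department.dept_name
         and department.dept_name = ?""",
    ("Physics",),
).fetchall()

print(names, rows)
```

Passing values as parameters rather than pasting them into the query string is the usual practice with such interfaces.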
Modes of access to DBMS
Application Programs and User Interfaces
- Most database users do not use a query language like SQL
- An application program acts as the intermediary between users and the database
  - Applications split into
    - front-end
    - middle layer
    - backend
- Front-end: user interface
  - Forms
  - Graphical user interfaces
  - Many interfaces are Web-based
Application Architecture Evolution
- Three distinct eras of application architecture
  - mainframe (1960s and 70s)
  - personal computer era (1980s)
  - Web era (1990s onwards)
Application Architecture at Web era
- model-view-controller (MVC) architecture
  - model: business logic
  - view: presentation of data, depends on display device
  - controller: receives events, executes actions, and returns a view to the user
- data access layer
  - interfaces between business logic layer and the underlying database
  - provides mapping from object model of business layer to relational model of database
Database Design
The process of designing the general structure of the database:
- Logical Design – Deciding on the database schema. Database design requires that we find a “good” collection of relation schemas.
  - Business decision – What attributes should we record in the database?
  - Computer Science decision – What relation schemas should we have and how should the attributes be distributed among the various relation schemas?
- Physical Design – Deciding on the physical layout of the database
Design Approaches
- Entity Relationship Model
  - Models an enterprise as a collection of entities and relationships
    - Entity: a “thing” or “object” in the enterprise that is distinguishable from other objects
      - Described by a set of attributes
    - Relationship: an association among several entities
  - Represented diagrammatically by an entity-relationship diagram
- Normalization Theory
  - Formalize what designs are bad, and test for them
Database Design?
- Is there any problem with this design? [figure omitted]
The Entity-Relationship Model
- Models an enterprise as a collection of entities and relationships
  - Entity: a “thing” or “object” in the enterprise that is distinguishable from other objects
    - Described by a set of attributes
  - Relationship: an association among several entities
- Represented diagrammatically by an entity-relationship diagram [figure omitted]
What happened to dept_name of instructor and student?
History (3)
- 1980s:
  - Research relational prototypes evolve into commercial systems
    - SQL becomes industrial standard
  - Parallel and distributed database systems
  - Object-oriented database systems
- 1990s:
  - Large decision support and data-mining applications
  - Large multi-terabyte data warehouses
  - Emergence of Web commerce
- Early 2000s:
  - XML and XQuery standards
  - Automated database administration
- Later 2000s:
  - Giant data storage systems
    - Google BigTable, Yahoo PNuts, Amazon, ...
End of Introduction
Relational Model
Example of a Relation
[figure omitted: a relation shown as a table, with its attributes (or columns) and tuples (or rows) labeled]
Attribute Types
- The set of allowed values for each attribute is called the domain of the attribute
- Attribute values are (normally) required to be atomic; that is, indivisible
- The special value null is a member of every domain
  - The null value causes complications in the definition of many operations
Relation Schema and Instance
- A1, A2, …, An are attributes
- R = (A1, A2, …, An) is a relation schema
  Example:
      instructor = (ID, name, dept_name, salary)
- Formally, given sets D1, D2, …, Dn a relation r is a subset of D1 × D2 × … × Dn.
  Thus, a relation is a set of n-tuples (a1, a2, …, an) where each ai ∈ Di
- The current values (relation instance) of a relation are specified by a table
- An element t of r is a tuple, represented by a row in a table
Relations are Unordered
- Order of tuples is irrelevant (tuples may be stored in an arbitrary order)
- Example: instructor relation with unordered tuples
- Tuples are not repeated (appear only once)
- Example: two listings of the instructor relation that differ only in row order are equivalent
Database
- A database consists of multiple relations
- Information about an enterprise (e.g., University) is broken up into parts
      instructor
      student
      advisor
- Bad design:
      univ (instructor_ID, name, dept_name, salary, student_Id, ..)
  results in
  - repetition of information (e.g., two students have the same instructor)
  - the need for null values (e.g., represent a student with no advisor)
- Normalization theory deals with how to design “good” relational schemas
Keys
- R = (A1, A2, …, An) is a relation schema
- Let K ⊆ R
- K is a superkey of R if values for K are sufficient to identify a unique tuple of each possible relation r(R)
  - Example: {ID} and {ID, name} are both superkeys of instructor.
- Superkey K is a candidate key if K is minimal
  - Example: {ID} is a candidate key for instructor
- One of the candidate keys is selected to be the primary key.
  - which one? chosen by the database designer
  - its attributes should never, or very rarely, change
- Foreign key constraint: Value in one relation must appear in another
  - Referencing relation
  - Referenced relation
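The definitions above can be tried out on a single relation instance with a small sketch. The helper names is_superkey and candidate_keys and the three sample tuples are assumptions for illustration. A real superkey must hold for every possible relation r(R), which a check on one instance cannot establish.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows of this one instance agree on all of attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def candidate_keys(rows, all_attrs):
    """Minimal attribute sets that act as superkeys on this instance."""
    found = []
    for size in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, size):
            minimal = not any(set(k) <= set(attrs) for k in found)
            if minimal and is_superkey(rows, attrs):
                found.append(attrs)
    return found

# Illustrative instance (not from the slides).
instructor = [
    {"ID": "10101", "name": "Kim", "dept_name": "Comp. Sci."},
    {"ID": "12121", "name": "Kim", "dept_name": "Finance"},
    {"ID": "15151", "name": "Wu", "dept_name": "Finance"},
]

keys = candidate_keys(instructor, ["ID", "name", "dept_name"])
print(keys)
```

On this tiny instance {name, dept_name} also looks like a key, which is exactly why keys are declared by the designer from the meaning of the data rather than inferred from current contents.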
Schema Diagram for University Database
Relational Query Languages
- How to retrieve the entries of a database?
- Procedural vs. non-procedural, or declarative
- “Pure” languages:
  - Relational algebra
  - Tuple relational calculus
  - Domain relational calculus
- Relational operators
Selection of tuples
- Relation r [table omitted]
- Select tuples with A = B and D > 5
  - σ_{A=B and D>5}(r)
Selection of Columns (Attributes)
- Relation r [table omitted]
- Select A and C
  - Projection: Π_{A,C}(r)
Joining two relations – Cartesian Product
- Relations r, s [tables omitted]
- r × s [table omitted]
Union of two relations
- Relations r, s [tables omitted]
- r ∪ s [table omitted]
Set difference of two relations
- Relations r, s [tables omitted]
- r − s [table omitted]
Set Intersection of two relations
- Relations r, s [tables omitted]
- r ∩ s [table omitted]
Joining two relations – Natural Join
- Let r and s be relations on schemas R and S respectively. Then, the “natural join” of relations r and s is a relation on schema R ∪ S obtained as follows:
  - Consider each pair of tuples tr from r and ts from s.
  - If tr and ts have the same value on each of the attributes in R ∩ S, add a tuple t to the result, where
    - t has the same value as tr on R
    - t has the same value as ts on S
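The procedure above can be sketched directly. In this illustration a relation is a list of dicts mapping attribute names to values; the function name natural_join and the sample relations are assumptions, not part of the slides.

```python
def natural_join(r, s):
    """Natural join of two relations given as lists of dicts."""
    result = []
    for tr in r:
        for ts in s:
            common = tr.keys() & ts.keys()       # attributes in R ∩ S
            if all(tr[a] == ts[a] for a in common):
                t = {**tr, **ts}                  # tuple on schema R ∪ S
                if t not in result:               # a relation is a set
                    result.append(t)
    return result

# Illustrative relations r(A, B) and s(B, C).
r = [{"A": 1, "B": "x"}, {"A": 2, "B": "y"}]
s = [{"B": "x", "C": 10}, {"B": "x", "C": 20}, {"B": "z", "C": 30}]

joined = natural_join(r, s)
print(joined)
```

When R and S share no attributes the condition is trivially true for every pair, so the result degenerates to the Cartesian product.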
Natural Join Example
- Relations r, s [tables omitted]
- Natural join: r ⋈ s [table omitted]
[Figures omitted: Figure in-2.1; schema diagram for the university database; sample instances of the instructor, course, prereq, department, section, and takes relations]
End
Simple SQL
- Overview of the SQL Query Language
- Data Definition
- Basic Query Structure
- Additional Basic Operations
- Set Operations
- Null Values
- Aggregate Functions
- Nested Subqueries
- Modification of the Database
History
- IBM Sequel language developed as part of System R project at the IBM San Jose Research Laboratory
- Renamed Structured Query Language (SQL)
- ANSI and ISO standard SQL:
  - SQL-86, SQL-89, SQL-92
  - SQL:1999, SQL:2003, SQL:2008
- Commercial systems offer most, if not all, SQL-92 features, plus varying feature sets from later standards and special proprietary features.
  - Not all examples here may work on your particular system.
Data Definition Language
The SQL data-definition language (DDL) allows the specification of information about relations, including:
- The schema for each relation.
- The domain of values associated with each attribute.
- Integrity constraints
- And as we will see later, also other information such as
  - The set of indices to be maintained for each relation.
  - Security and authorization information for each relation.
  - The physical storage structure of each relation on disk.
Domain Types in SQL
- char(n). Fixed length character string, with user-specified length n.
- varchar(n). Variable length character strings, with user-specified maximum length n.
- int. Integer (a finite subset of the integers that is machine-dependent).
- smallint. Small integer (a machine-dependent subset of the integer domain type).
- numeric(p,d). Fixed point number, with user-specified precision of p digits, with d digits to the right of decimal point.
- real, double precision. Floating point and double-precision floating point numbers, with machine-dependent precision.
- float(n). Floating point number, with user-specified precision of at least n digits.
Create Table Construct
- An SQL relation is defined using the create table command:
      create table r (A1 D1, A2 D2, ..., An Dn,
                      (integrity-constraint1),
                      ...,
                      (integrity-constraintk))
  - r is the name of the relation
  - each Ai is an attribute name in the schema of relation r
  - Di is the data type of values in the domain of attribute Ai
- Example:
      create table instructor (
          ID         char(5),
          name       varchar(20) not null,
          dept_name  varchar(20),
          salary     numeric(8,2))
- insert into instructor values ('10211', 'Smith', 'Biology', 66000);
- insert into instructor values ('10211', null, 'Biology', 66000);
  (rejected: name is declared not null)
Integrity Constraints in Create Table
- not null
- primary key (A1, ..., An)
- foreign key (Am, ..., An) references r
Example: Declare ID as the primary key for instructor
      create table instructor (
          ID         char(5),
          name       varchar(20) not null,
          dept_name  varchar(20),
          salary     numeric(8,2),
          primary key (ID),
          foreign key (dept_name) references department)
primary key declaration on an attribute automatically ensures not null
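A minimal sketch of the three constraints being enforced, run through Python's sqlite3 module. The department table's columns, the sample rows, and the try_insert helper are assumptions for illustration. SQLite checks foreign keys only after the pragma shown, and other systems word their error messages differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK checks
conn.executescript("""
    create table department (
        dept_name varchar(20),
        building  varchar(15),
        primary key (dept_name));
    create table instructor (
        ID        char(5),
        name      varchar(20) not null,
        dept_name varchar(20),
        salary    numeric(8,2),
        primary key (ID),
        foreign key (dept_name) references department);
    insert into department values ('Biology', 'Watson');
    insert into instructor values ('10211', 'Smith', 'Biology', 66000);
""")

def try_insert(row):
    """Insert one instructor tuple; return 'ok' or the constraint error text."""
    try:
        with conn:
            conn.execute("insert into instructor values (?, ?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)

duplicate_id = try_insert(("10211", "Jones", "Biology", 70000))  # primary key
missing_name = try_insert(("10212", None, "Biology", 66000))     # not null
unknown_dept = try_insert(("10213", "Lee", "Music", 50000))      # foreign key
accepted = try_insert(("10214", "Park", "Biology", 58000))
print(duplicate_id, missing_name, unknown_dept, accepted, sep="\n")
```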
And a Few More Relation Definitions
- create table student (
      ID         varchar(5),
      name       varchar(20) not null,
      dept_name  varchar(20),
      tot_cred   numeric(3,0),
      primary key (ID),
      foreign key (dept_name) references department);
- create table takes (
      ID         varchar(5),
      course_id  varchar(8),
      sec_id     varchar(8),
      semester   varchar(6),
      year       numeric(4,0),
      grade      varchar(2),
      primary key (ID, course_id, sec_id, semester, year),
      foreign key (ID) references student,
      foreign key (course_id, sec_id, semester, year) references section);
  - Note: sec_id can be dropped from primary key above, to ensure a student cannot be registered for two sections of the same course in the same semester
And more still
- create table course (
      course_id  varchar(8) primary key,
      title      varchar(50),
      dept_name  varchar(20),
      credits    numeric(2,0),
      foreign key (dept_name) references department);
  - Primary key declaration can be combined with attribute declaration as shown above
Drop and Alter Table Constructs
- drop table student
  - Deletes the table and its contents
- delete from student
  - Deletes all contents of table, but retains table
- alter table
  - alter table r add A D
    - where A is the name of the attribute to be added to relation r and D is the domain of A.
    - All tuples in the relation are assigned null as the value for the new attribute.
  - alter table r drop A
    - where A is the name of an attribute of relation r
    - Dropping of attributes not supported by many databases
Basic Query Structure
- The SQL data-manipulation language (DML) provides the ability to query information, and insert, delete and update tuples
- A typical SQL query has the form:
      select A1, A2, ..., An
      from r1, r2, ..., rm
      where P
  - Ai represents an attribute
  - ri represents a relation
  - P is a predicate.
- The result of an SQL query is a relation.
The select Clause
- The select clause lists the attributes desired in the result of a query
  - corresponds to the projection operation of the relational algebra
- Example: find the names of all instructors:
      select name
      from instructor
- NOTE: SQL names are case insensitive (i.e., you may use upper- or lower-case letters.)
  - E.g. Name ≡ NAME ≡ name
  - Some people use upper case wherever we use bold font.
The select Clause (Cont.)
- SQL allows duplicates in relations as well as in query results.
- To force the elimination of duplicates, insert the keyword distinct after select.
- Find the names of all departments that have instructors, and remove duplicates
      select distinct dept_name
      from instructor
- The keyword all specifies that duplicates not be removed.
      select all dept_name
      from instructor
The select Clause (Cont.)
- An asterisk in the select clause denotes “all attributes”
      select *
      from instructor
- The select clause can contain arithmetic expressions involving the operations +, –, *, and /, and operating on constants or attributes of tuples.
- The query:
      select ID, name, salary/12
      from instructor
  would return a relation that is the same as the instructor relation, except that the value of the attribute salary is divided by 12.
The where Clause
- The where clause specifies conditions that the result must satisfy
  - Corresponds to the selection predicate of the relational algebra.
- To find all instructors in Comp. Sci. dept with salary > 80000
      select name
      from instructor
      where dept_name = 'Comp. Sci.' and salary > 80000
- Comparison results can be combined using the logical connectives and, or, and not.
- Comparisons can be applied to results of arithmetic expressions.
The from Clause
- The from clause lists the relations involved in the query
  - Corresponds to the Cartesian product operation of the relational algebra.
- Find the Cartesian product instructor × teaches
      select *
      from instructor, teaches
  - generates every possible instructor–teaches pair, with all attributes from both relations
- Cartesian product not very useful directly, but useful combined with where-clause condition (selection operation in relational algebra)
Cartesian Product: instructor X teaches
[tables omitted: the instructor and teaches relations and their Cartesian product]
Joins
- For all instructors who have taught some course, find their names and the course ID of the courses they taught.
      select name, course_id
      from instructor, teaches
      where instructor.ID = teaches.ID
- Find the course ID, semester, year and title of each course offered by the Comp. Sci. department
      select section.course_id, semester, year, title
      from section, course
      where section.course_id = course.course_id and
            dept_name = 'Comp. Sci.'
Natural Join
- Natural join matches tuples with the same values for all common attributes, and retains only one copy of each common column
- select *
  from instructor natural join teaches;
Natural Join Example
- List the names of instructors along with the course ID of the courses that they taught.
  - select name, course_id
    from instructor, teaches
    where instructor.ID = teaches.ID;
  - select name, course_id
    from instructor natural join teaches;
  - select name, course_id
    from instructor join teaches on instructor.ID = teaches.ID;
Natural Join (Cont.)
- Danger in natural join: beware of unrelated attributes with same name which get equated incorrectly
- List the names of instructors along with the titles of courses that they teach
  - Incorrect version (makes course.dept_name = instructor.dept_name)
    - select name, title
      from instructor natural join teaches natural join course;
  - Correct version
    - select name, title
      from instructor natural join teaches, course
      where teaches.course_id = course.course_id;
  - Another correct version
    - select name, title
      from (instructor natural join teaches)
      join course using(course_id);
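The difference between the three versions shows up as soon as an instructor teaches a course from another department. The sketch below runs them with Python's sqlite3 module on a tiny made-up instance; the rows, the untyped column lists, and the titles helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One Physics instructor who teaches a Physics course and a Comp. Sci. course.
conn.executescript("""
    create table instructor (ID, name, dept_name);
    create table teaches (ID, course_id);
    create table course (course_id, title, dept_name);
    insert into instructor values ('1', 'Kim', 'Physics');
    insert into teaches values ('1', 'PHY-101'), ('1', 'CS-101');
    insert into course values
        ('PHY-101', 'Physical Principles', 'Physics'),
        ('CS-101', 'Intro. to Computer Science', 'Comp. Sci.');
""")

def titles(sql):
    """Run a (name, title) query and return the sorted list of titles."""
    return sorted(title for _name, title in conn.execute(sql))

incorrect = titles(
    "select name, title "
    "from instructor natural join teaches natural join course"
)
correct = titles(
    "select name, title "
    "from instructor natural join teaches, course "
    "where teaches.course_id = course.course_id"
)
using = titles(
    "select name, title "
    "from (instructor natural join teaches) join course using (course_id)"
)
print(incorrect, correct, using, sep="\n")
```

The first query silently drops the Comp. Sci. course because dept_name is also equated; the other two return both titles.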
The Rename Operation
- SQL allows renaming relations and attributes using the as clause:
      old-name as new-name
- E.g.
  - select ID, name, salary/12 as monthly_salary
    from instructor
- Find the names of all instructors who have a higher salary than some instructor in 'Comp. Sci.'.
  - select distinct T.name
    from instructor as T, instructor as S
    where T.salary > S.salary and S.dept_name = 'Comp. Sci.'
- Keyword as is optional and may be omitted
      instructor as T ≡ instructor T
  - Keyword as must be omitted in Oracle
String Operations
- SQL includes a string-matching operator for comparisons on character strings. The operator “like” uses patterns that are described using two special characters:
  - percent (%). The % character matches any substring.
  - underscore (_). The _ character matches any single character.
- Find the names of all instructors whose name includes the substring “dar”.
      select name
      from instructor
      where name like '%dar%'
- Match the string “100 %”
      like '100 \%' escape '\'
String Operations (Cont.)
- Patterns are case sensitive.
- Pattern matching examples:
  - 'Intro%' matches any string beginning with “Intro”.
  - '%Comp%' matches any string containing “Comp” as a substring.
  - '___' (three underscores) matches any string of exactly three characters.
  - '___%' (three underscores, then %) matches any string of at least three characters.
- SQL supports a variety of string operations such as
  - concatenation (using “||”)
  - converting from upper to lower case (and vice versa)
  - finding string length, extracting substrings, etc.
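The patterns above can be checked with a small helper that asks SQLite to evaluate a like expression. The helper name like and the test strings are assumptions for illustration. One caveat: SQLite's like is case-insensitive for ASCII letters by default, unlike the case-sensitive behaviour described on the slide, so the examples avoid relying on case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def like(value, pattern, escape=None):
    """Evaluate `value like pattern [escape escape]` and return a bool."""
    if escape is None:
        row = conn.execute("select ? like ?", (value, pattern)).fetchone()
    else:
        row = conn.execute(
            "select ? like ? escape ?", (value, pattern, escape)
        ).fetchone()
    return bool(row[0])

checks = {
    "substring": like("Sudarshan", "%dar%"),
    "prefix": like("Introduction", "Intro%"),
    "exactly_three": like("abc", "___"),
    "not_exactly_three": like("abcd", "___"),
    "at_least_three": like("abcd", "___%"),
    "too_short": like("ab", "___%"),
    "escaped_percent": like("100 %", "100 \\%", "\\"),
    "escaped_percent_miss": like("100 x", "100 \\%", "\\"),
}
print(checks)
```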
Ordering the Display of Tuples
- List in alphabetic order the names of all instructors
      select distinct name
      from instructor
      order by name
- We may specify desc for descending order or asc for ascending order, for each attribute; ascending order is the default.
  - Example: order by name desc
- Can sort on multiple attributes and on renamings
  - select name, ceiling(salary/1000) as [salary in thousands]
    from instructor
    order by [salary in thousands] desc, name asc
Where Clause Predicates
- SQL includes a between comparison operator
- Example: Find the names of all instructors with salary between $90,000 and $100,000 (that is, ≥ $90,000 and ≤ $100,000)
  - select name
    from instructor
    where salary between 90000 and 100000
- Tuple comparison
  - select name, course_id
    from instructor, teaches
    where (instructor.ID, dept_name) = (teaches.ID, 'Biology');
  - which is equivalent to
    select name, course_id
    from instructor, teaches
    where instructor.ID = teaches.ID and dept_name = 'Biology';
Duplicates
- In relations with duplicates, SQL can define how many copies of tuples appear in the result.
- Multiset versions of some of the relational algebra operators – given multiset relations r1 and r2:
  1. σ_θ(r1): If there are c1 copies of tuple t1 in r1, and t1 satisfies selection σ_θ, then there are c1 copies of t1 in σ_θ(r1).
  2. Π_A(r1): For each copy of tuple t1 in r1, there is a copy of tuple Π_A(t1) in Π_A(r1), where Π_A(t1) denotes the projection of the single tuple t1.
  3. r1 × r2: If there are c1 copies of tuple t1 in r1 and c2 copies of tuple t2 in r2, there are c1 × c2 copies of the tuple t1.t2 in r1 × r2
Duplicates (Cont.)
- Example: Suppose multiset relations r1 (A, B) and r2 (C) are as follows:
      r1 = {(1, a), (2, a)}
      r2 = {(2), (3), (3)}
- Then Π_B(r1) would be {(a), (a)}, while Π_B(r1) × r2 would be
      {(a,2), (a,2), (a,3), (a,3), (a,3), (a,3)}
- SQL duplicate semantics:
      select A1, A2, ..., An
      from r1, r2, ..., rm
      where P
  is equivalent to the multiset version of the expression:
      Π_{A1,A2,…,An}(σ_P(r1 × r2 × … × rm))
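The worked example can be reproduced with plain Python lists standing in for multisets (a sketch; the variable names are chosen for illustration). Lists keep duplicates, and Counter is used only to count the copies.

```python
from collections import Counter
from itertools import product

r1 = [(1, "a"), (2, "a")]   # multiset relation r1(A, B)
r2 = [(2,), (3,), (3,)]     # multiset relation r2(C)

# Multiset projection on B: one output tuple per input tuple, duplicates kept.
proj_b = [(b,) for (_a, b) in r1]

# Multiset Cartesian product: c1 * c2 copies of each concatenated tuple.
prod = [t1 + t2 for t1, t2 in product(proj_b, r2)]

copies = Counter(prod)
print(proj_b, dict(copies))
```

The counts come out as two copies of (a, 2) and four copies of (a, 3), matching the six tuples listed on the slide.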
Set Operations
- Find courses that ran in Fall 2009 or in Spring 2010
      (select course_id from section where semester = 'Fall' and year = 2009)
      union
      (select course_id from section where semester = 'Spring' and year = 2010)
- Find courses that ran in Fall 2009 and in Spring 2010
      (select course_id from section where semester = 'Fall' and year = 2009)
      intersect
      (select course_id from section where semester = 'Spring' and year = 2010)
- Find courses that ran in Fall 2009 but not in Spring 2010
      (select course_id from section where semester = 'Fall' and year = 2009)
      except
      (select course_id from section where semester = 'Spring' and year = 2010)
Set Operations
- Set operations union, intersect, and except
  - Each of the above operations automatically eliminates duplicates
- To retain all duplicates use the corresponding multiset versions union all, intersect all and except all.
  Suppose a tuple occurs m times in r and n times in s, then, it occurs:
  - m + n times in r union all s
  - min(m,n) times in r intersect all s
  - max(0, m – n) times in r except all s
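The three counting rules coincide with the arithmetic of Python's collections.Counter, which makes for a compact sketch. The course IDs and counts are made up for illustration; a Counter maps each tuple to its number of copies.

```python
from collections import Counter

r = Counter({"CS-101": 3, "BIO-101": 1})   # m copies of each tuple in r
s = Counter({"CS-101": 2, "PHY-101": 1})   # n copies of each tuple in s

union_all = r + s        # m + n copies
intersect_all = r & s    # min(m, n) copies
except_all = r - s       # max(0, m - n) copies; zero counts are dropped

print(dict(union_all), dict(intersect_all), dict(except_all))
```

The plain union, intersect, and except results are the key sets of these Counters, since those operations eliminate duplicates.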
Null Values
- It is possible for tuples to have a null value, denoted by null, for some of their attributes
- null signifies an unknown value or that a value does not exist.
- The result of any arithmetic expression involving null is null
  - Example: 5 + null returns null
- The predicate is null can be used to check for null values.
  - Example: Find all instructors whose salary is null.
        select name
        from instructor
        where salary is null
Null Values and Three Valued Logic
- Any comparison with null returns unknown
  - Example: 5 < null or null <> null or null = null
- Three-valued logic using the truth value unknown:
  - OR: (unknown or true) = true,
        (unknown or false) = unknown,
        (unknown or unknown) = unknown
  - AND: (true and unknown) = unknown,
         (false and unknown) = false,
         (unknown and unknown) = unknown
  - NOT: (not unknown) = unknown
  - “P is unknown” evaluates to true if predicate P evaluates to unknown
- Result of where clause predicate is treated as false if it evaluates to unknown
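The truth tables above can be written out as a short sketch in which Python's None plays the role of both null and unknown. The function names tv_and, tv_or, tv_not, eq, and where are assumptions for illustration.

```python
def tv_not(p):
    """NOT: unknown stays unknown."""
    return None if p is None else (not p)

def tv_and(p, q):
    """AND: false dominates, otherwise unknown propagates."""
    if p is False or q is False:
        return False
    if p is None or q is None:
        return None
    return True

def tv_or(p, q):
    """OR: true dominates, otherwise unknown propagates."""
    if p is True or q is True:
        return True
    if p is None or q is None:
        return None
    return False

def eq(a, b):
    """A comparison involving null evaluates to unknown."""
    return None if a is None or b is None else a == b

def where(values, predicate):
    """Keep a value only if the predicate is true; unknown counts as false."""
    return [v for v in values if predicate(v) is True]

salaries = [90000, None, 72000]
kept = where(salaries, lambda x: tv_or(eq(x, 90000), eq(x, 72000)))
print(kept)
```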
Aggregate Functions
- These functions operate on the multiset of values of a column of a relation, and return a value
      avg: average value
      min: minimum value
      max: maximum value
      sum: sum of values
      count: number of values
Aggregate Functions (Cont.)
- Find the average salary of instructors in the Computer Science department
  - select avg (salary)
    from instructor
    where dept_name = 'Comp. Sci.';
- Find the total number of instructors who teach a course in the Spring 2010 semester
  - select count (distinct ID)
    from teaches
    where semester = 'Spring' and year = 2010
- Find the number of tuples in the course relation
  - select count (*)
    from course;
Aggregate Functions – Group By
n Find the average salary of instructors in each department
l
select dept_name, avg (salary)
from instructor
group by dept_name;
l
Note: departments with no instructor will not appear in result
Aggregation (Cont.)
n Attributes in select clause outside of aggregate
functions must appear in group by list
l
/* erroneous query */
select dept_name, ID, avg (salary)
from instructor
group by dept_name;
Aggregate Functions – Having Clause
n Find the names and average salaries of all departments whose
average salary is greater than 42000
select dept_name, avg (salary)
from instructor
group by dept_name
having avg (salary) > 42000;
Note: predicates in the having clause are applied after the
formation of groups whereas predicates in the where
clause are applied before forming groups
Null Values and Aggregates
n Total all salaries
select sum (salary )
from instructor
l
Above statement ignores null amounts
l
Result is null if there is no non-null amount
n All aggregate operations except count(*) ignore tuples with null
values on the aggregated attributes
n What if collection has only null values?
l
count returns 0
l
all other aggregates return null
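A minimal sketch of the all-null case, assuming Python's built-in `sqlite3` module as a stand-in database and two hypothetical instructor rows whose salaries are both null:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table instructor (name text, salary numeric)")
# Hypothetical rows: every salary is null.
conn.executemany("insert into instructor values (?, ?)",
                 [("Kim", None), ("Lee", None)])

total, n_salaries, n_rows = conn.execute(
    "select sum(salary), count(salary), count(*) from instructor"
).fetchone()
# total is None (null), count(salary) ignores nulls, count(*) counts tuples.
```

Here count(salary) and count(*) differ because only count(*) counts tuples regardless of nulls.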
Nested Subqueries
n SQL provides a mechanism for the nesting of subqueries.
n A subquery is a select-from-where expression that is nested
within another query.
n A common use of subqueries is to perform tests for set
membership, set comparisons, and set cardinality.
Example Query
n Find courses offered in Fall 2009 and in Spring 2010
select distinct course_id
from section
where semester = ’Fall’ and year= 2009 and
course_id in (select course_id
from section
where semester = ’Spring’ and year= 2010);
n Find courses offered in Fall 2009 but not in Spring 2010
select distinct course_id
from section
where semester = ’Fall’ and year= 2009 and
course_id not in (select course_id
from section
where semester = ’Spring’ and year= 2010);
Example Query
n Find the total number of (distinct) students who have taken
course sections taught by the instructor with a given ID
select count (distinct ID)
from takes
where course_id in
(select course_id
from teaches
where teaches.ID=14365);
n
Note: Try without distinct
Set Comparison
n Find names of instructors with salary greater than that of some
(at least one) instructor in the Biology department.
select distinct T.name
from instructor as T, instructor as S
where T.salary > S.salary and S.dept_name = ’Biology’;
n
Same query using > some clause
select name
from instructor
where salary > some (select salary
from instructor
where dept_name = ’Biology’);
Definition of Some Clause
n F <comp> some r ⇔ ∃ t ∈ r such that (F <comp> t)
Where <comp> can be: <, ≤, >, =, ≠
(5 < some {0, 5, 6}) = true
(read: 5 < some tuple in the relation)
(5 < some {0, 5}) = false
(5 = some {0, 5}) = true
(5 ≠ some {0, 5}) = true (since 0 ≠ 5)
(= some) ≡ in
However, (≠ some) ≢ not in
Example Query
n Find the names of all instructors whose salary is greater than
the salary of all instructors in the Biology department.
select name
from instructor
where salary > all (select salary
from instructor
where dept_name = ’Biology’);
Definition of all Clause
n F <comp> all r ⇔ ∀ t ∈ r (F <comp> t)
(5 < all {0, 5, 6}) = false
(5 < all {6, 10}) = true
(5 = all {4, 5}) = false
(5 ≠ all {4, 6}) = true (since 5 ≠ 4 and 5 ≠ 6)
(≠ all) ≡ not in
However, (= all) ≢ in
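The definitions of some and all correspond to existential and universal quantification, which can be sketched in plain Python with `any` and `all`. The helper names `some` and `all_` are hypothetical, the relations are single-column lists, and null values are deliberately left out of this sketch.

```python
from operator import lt, eq, ne

def some(f, comp, r):
    """F <comp> some r: the comparison holds for at least one t in r."""
    return any(comp(f, t) for t in r)

def all_(f, comp, r):
    """F <comp> all r: the comparison holds for every t in r."""
    return all(comp(f, t) for t in r)

some_results = [some(5, lt, [0, 5, 6]), some(5, lt, [0, 5]),
                some(5, eq, [0, 5]), some(5, ne, [0, 5])]
all_results = [all_(5, lt, [0, 5, 6]), all_(5, lt, [6, 10]),
               all_(5, eq, [4, 5]), all_(5, ne, [4, 6])]

# (<> some) is not the same as not in: 5 is in [0, 5], yet 5 <> some [0, 5] holds.
differs = some(5, ne, [0, 5]) != (5 not in [0, 5])
```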
Test for Empty Relations
n The exists construct returns the value true if the argument
subquery is nonempty.
n exists r ⇔ r ≠ Ø
n not exists r ⇔ r = Ø
Correlation Variables
n Yet another way of specifying the query “Find all courses
taught in both the Fall 2009 semester and in the Spring 2010
semester”
select course_id
from section as S
where semester = ’Fall’ and year= 2009 and
exists (select *
from section as T
where semester = ’Spring’ and year= 2010
and S.course_id= T.course_id);
n Correlated subquery
n Correlation name or correlation variable
Not Exists
n Find all students who have taken all courses offered in the
Biology department.
select distinct S.ID, S.name
from student as S
where not exists ( (select course_id
from course
where dept_name = ’Biology’)
except
(select T.course_id
from takes as T
where S.ID = T.ID));
n
Note that X – Y = Ø ⇔ X ⊆ Y
n
Note: Cannot write this query using = all and its variants
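The not exists ... except pattern can be tried on a tiny data set. This is a sketch under stated assumptions: it uses Python's built-in `sqlite3` module as a stand-in database, the rows are hypothetical, and the parentheses around the two operands of except are dropped because SQLite does not accept them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table student (ID text, name text);
create table course (course_id text, dept_name text);
create table takes (ID text, course_id text);
insert into student values ('1', 'Ann'), ('2', 'Bob');
insert into course values ('BIO-101', 'Biology'), ('BIO-301', 'Biology'),
                          ('CS-101', 'Comp. Sci.');
insert into takes values ('1', 'BIO-101'), ('1', 'BIO-301'),
                         ('2', 'BIO-101'), ('2', 'CS-101');
""")

# For each student S: (all Biology courses) except (courses S has taken)
# must be empty, i.e. the Biology courses are a subset of S's courses.
rows = conn.execute("""
select distinct S.ID, S.name
from student as S
where not exists (select course_id
                  from course
                  where dept_name = 'Biology'
                  except
                  select T.course_id
                  from takes as T
                  where S.ID = T.ID)
""").fetchall()
```

Only Ann qualifies here, because Bob has not taken BIO-301.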
Test for Absence of Duplicate Tuples
n The unique construct tests whether a subquery has any duplicate
tuples in its result.
l
(Evaluates to “true” on an empty set)
n Find all courses that were offered at most once in 2008
select T.course_id, T.title
from course as T
where unique
(select R.course_id
from section as R
where T.course_id= R.course_id and R.year = 2008);
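n Alternative without unique; note that it returns only courses offered exactly once in 2008, not courses that were not offered at all
select T.course_id, T.title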
from course as T
where T.course_id in (select R.course_id
from section as R
where R.year=2008
group by R.course_id
having COUNT(*) = 1);
Subqueries in the From Clause
n SQL allows a subquery expression to be used in the from clause
n Find the average instructors’ salaries of those departments where the
average salary is greater than $42,000.
select dept_name, avg_salary
from (select dept_name, avg (salary) as avg_salary
from instructor
group by dept_name)
where avg_salary > 42000;
n Note that we do not need to use the having clause
n Another way to write above query
select dept_name, avg_salary
from (select dept_name, avg (salary)
from instructor
group by dept_name)
as dept_avg (dept_name, avg_salary)
where avg_salary > 42000;
Subqueries in the From Clause (Cont.)
n And yet another way to write it: lateral clause
select name, salary, avg_salary
from instructor I1,
lateral (select avg(salary) as avg_salary
from instructor I2
where I2.dept_name= I1.dept_name);
n Lateral clause permits later part of the from clause (after the lateral
keyword) to access correlation variables from the earlier part.
n Note: lateral is part of the SQL standard, but is not supported on
many database systems; some databases such as SQL Server offer
alternative syntax
l
Exercise: find the alternative syntax
With Clause
n The with clause provides a way of defining a temporary view
whose definition is available only to the query in which the with
clause occurs.
n Find all departments with the maximum budget
with max_budget (value) as
(select max(budget)
from department)
select dept_name
from department, max_budget
where department.budget = max_budget.value;
Complex Queries using With Clause
n With clause is very useful for writing complex queries
n Supported by most database systems, with minor syntax
variations
n Find all departments where the total salary is greater than the
average of the total salary at all departments
with dept_total (dept_name, value) as
(select dept_name, sum(salary)
from instructor
group by dept_name),
dept_total_avg(value) as
(select avg(value)
from dept_total)
select dept_name
from dept_total, dept_total_avg
where dept_total.value >= dept_total_avg.value;
Scalar Subquery
n Scalar subquery is one which is used where a single value is expected
n
E.g. select dept_name,
(select count(*)
from instructor
where department.dept_name = instructor.dept_name)
as num_instructors
from department;
n E.g. select name
from instructor
where salary * 10 >
(select budget from department
where department.dept_name = instructor.dept_name)
n Runtime error if subquery returns more than one result tuple
Modification of the Database
n Deletion of tuples from a given relation
n Insertion of new tuples into a given relation
n Updating values in some tuples in a given relation
Modification of the Database – Deletion
n Delete all instructors
delete from instructor
n Delete all instructors from the Finance department
delete from instructor
where dept_name= ’Finance’;
n Delete all tuples in the instructor relation for those instructors
associated with a department located in the Watson building.
delete from instructor
where dept_name in (select dept_name
from department
where building = ’Watson’);
Deletion (Cont.)
n Delete all instructors whose salary is less than the average
salary of instructors
delete from instructor
where salary< (select avg (salary) from instructor);
l
Problem: as we delete tuples from instructor, the average salary
changes
l
Solution used in SQL:
1. First, compute avg salary and find all tuples to delete
2. Next, delete all tuples found above (without recomputing avg or
retesting the tuples)
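The effect of computing the average once can be observed in a small sketch, assuming Python's built-in `sqlite3` module as a stand-in database and three hypothetical salaries. The average of 10, 50 and 60 is 40, so only the tuple with salary 10 qualifies. If the average were recomputed after each deletion, 50 would then fall below the new average of 55 and be deleted as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table instructor (name text, salary integer)")
# Hypothetical salaries chosen so that recomputing the average would matter.
conn.executemany("insert into instructor values (?, ?)",
                 [("A", 10), ("B", 50), ("C", 60)])

# The subquery is evaluated once, before any tuple is deleted.
conn.execute(
    "delete from instructor where salary < (select avg(salary) from instructor)")

remaining = [salary for (salary,) in
             conn.execute("select salary from instructor order by salary")]
```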
Modification of the Database – Insertion
n Add a new tuple to course
insert into course
values (’CS-437’, ’Database Systems’, ’Comp. Sci.’, 4);
n or equivalently
insert into course (course_id, title, dept_name, credits)
values (’CS-437’, ’Database Systems’, ’Comp. Sci.’, 4);
n Add a new tuple to student with tot_creds set to null
insert into student
values (’3003’, ’Green’, ’Finance’, null);
Insertion (Cont.)
n Add all instructors to the student relation with tot_creds set to 0
insert into student
select ID, name, dept_name, 0
from instructor
n The select from where statement is evaluated fully before any of
its results are inserted into the relation. Otherwise queries like
insert into table1 select * from table1
would cause problems if table1 did not have any primary key
defined.
Modification of the Database – Updates
n Increase salaries of instructors whose salary is over $100,000 by
3%, and all others receive a 5% raise
l
Write two update statements:
update instructor
set salary = salary * 1.03
where salary > 100000;
update instructor
set salary = salary * 1.05
where salary <= 100000;
l
The order is important
l
Can be done better using the case statement (next slide)
Case Statement for Conditional Updates
n Same query as before but with case statement
update instructor
set salary = case
when salary <= 100000 then salary * 1.05
else salary * 1.03
end
Updates with Scalar Subqueries
n Recompute and update tot_creds value for all students
update student
set tot_cred = ( select sum(credits)
from takes natural join course
where student.ID= takes.ID and
takes.grade <> ’F’ and
takes.grade is not null);
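l
Note: on systems without natural join (e.g., SQL Server), replace natural join with an explicit join condition
n Sets tot_creds to null for students who have not taken any course
l
Try it: insert a new student into the student table and check the result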
n Instead use:
update student
set tot_cred =
(select case
when sum(credits) is not null then sum(credits) else 0
end
from takes join course on takes.course_id = course.course_id
where student.ID= takes.ID and takes.grade <> 'F' and takes.grade is not null)
Intermediate SQL
Intermediate SQL
n Join Expressions
n Views
n Transactions
n Integrity Constraints
n SQL Data Types and Schemas
n Authorization
Joined Relations
n Join operations take two relations and return as a result
another relation.
n A join operation is a Cartesian product which requires that
tuples in the two relations match (under some condition).
It also specifies the attributes that are present in the result
of the join
n The join operations are typically used as subquery
expressions in the from clause
Join operations – Example
n Relation course
n Relation prereq
n Observe that
prereq information is missing for CS-315 and
course information is missing for CS-437
Outer Join
n An extension of the join operation that avoids loss of
information.
n Computes the join and then adds tuples from one relation
that do not match tuples in the other relation to the result
of the join.
n Uses null values.
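A minimal sketch of how an outer join pads unmatched tuples with nulls, assuming Python's built-in `sqlite3` module as a stand-in database. The rows are hypothetical, modeled on the course and prereq description above: CS-315 has no prereq row, and CS-437 appears only in prereq.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table course (course_id text, title text);
create table prereq (course_id text, prereq_id text);
insert into course values ('BIO-301', 'Genetics'), ('CS-190', 'Game Design'),
                          ('CS-315', 'Robotics');
insert into prereq values ('BIO-301', 'BIO-101'), ('CS-190', 'CS-101'),
                          ('CS-437', 'CS-101');
""")

# Every course tuple is kept; a course without a match gets null for prereq_id.
rows = conn.execute("""
select course.course_id, title, prereq_id
from course left outer join prereq on course.course_id = prereq.course_id
order by course.course_id
""").fetchall()
```

CS-315 is preserved with a null prereq_id, while the unmatched prereq tuple for CS-437 is dropped because only the left relation is preserved.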
Left Outer Join
n course natural left outer join prereq
select *
from course left outer join prereq on course.course_id = prereq.course_id;
Why doesn’t SQL Server support natural join syntax?
Natural joins are convenient for writing quick queries, and other major databases, such as MySQL and Oracle, do support them.
However, natural joins have some downsides:
1. Because natural joins are implicit, there is no way to see what columns will be used in
the join. You might not get what you think you’re getting.
2. If a column name or type is altered or the column is removed from one of the tables, the
next time the SELECT statement is run the join will break.
Right Outer Join
n course natural right outer join prereq
select *
from course right outer join prereq on course.course_id = prereq.course_id;
Full Outer Join
n course natural full outer join prereq
select *
from course full outer join prereq on course.course_id = prereq.course_id;
Joined Relations
n Join operations take two relations and return as a result
another relation.
n These additional operations are typically used as subquery
expressions in the from clause
n Join condition – defines which tuples in the two relations
match, and what attributes are present in the result of the join.
n Join type – defines how tuples in each relation that do not
match any tuple in the other relation (based on the join
condition) are treated.
Various forms of join conditions
n course inner join prereq on
course.course_id = prereq.course_id
n What is the difference between the above, and a natural join?
n course left outer join prereq on
course.course_id = prereq.course_id
Various forms of join conditions
n course natural right outer join prereq
n course full outer join prereq using (course_id)
Views
n In some cases, it is not desirable for all users to see the entire
logical model (that is, all the actual relations stored in the
database.)
n Consider a person who needs to know an instructor’s name
and department, but not the salary. This person should see a
relation described, in SQL, by
select ID, name, dept_name
from instructor
n A view provides a mechanism to hide certain data from the
view of certain users.
n Any relation that is not part of the conceptual model but is made
visible to a user as a “virtual relation” is called a view.
View Definition
n A view is defined using the create view statement which has
the form
create view v as < query expression >
where <query expression> is any legal SQL expression. The
view name is represented by v.
n Once a view is defined, the view name can be used to refer to
the virtual relation that the view generates.
n View definition is not the same as creating a new relation by
evaluating the query expression
l
Rather, a view definition causes the saving of an expression;
the expression is substituted into queries using the view.
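That a view stores an expression rather than a result can be observed in a short sketch, assuming Python's built-in `sqlite3` module as a stand-in database and two hypothetical instructor rows. The view is defined once, and a tuple inserted into the base relation afterwards still shows up when the view is queried.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table instructor (ID text, name text, dept_name text, salary numeric);
insert into instructor values ('10101', 'Srinivasan', 'Comp. Sci.', 65000);
create view faculty as
    select ID, name, dept_name
    from instructor;
""")

before = conn.execute("select count(*) from faculty").fetchone()[0]
# Change the base relation only; the view definition is untouched.
conn.execute("insert into instructor values ('76766', 'Crick', 'Biology', 72000)")
after = conn.execute("select count(*) from faculty").fetchone()[0]
```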
Example Views
n A view of instructors without their salary
create view faculty as
select ID, name, dept_name
from instructor
n Find all instructors in the Biology department
select name
from faculty
where dept_name = ‘Biology’
n Create a view of department salary totals
create view departments_total_salary(dept_name, total_salary) as
select dept_name, sum (salary)
from instructor
group by dept_name;
Views Defined Using Other Views
n create view physics_fall_2009 as
select course.course_id, sec_id, building, room_number
from course, section
where course.course_id = section.course_id
and course.dept_name = ’Physics’
and section.semester = ’Fall’
and section.year = 2009;
n create view physics_fall_2009_watson as
select course_id, room_number
from physics_fall_2009
where building= ’Watson’;
View Expansion
n Expand use of a view in a query/another view
create view physics_fall_2009_watson as
select course_id, room_number
from (select course.course_id, building, room_number
from course, section
where course.course_id = section.course_id
and course.dept_name = ’Physics’
and section.semester = ’Fall’
and section.year = 2009)
where building= ’Watson’;
Views Defined Using Other Views
n One view may be used in the expression defining another view
n A view relation v1 is said to depend directly on a view relation
v2 if v2 is used in the expression defining v1
n A view relation v1 is said to depend on view relation v2 if either
v1 depends directly on v2 or there is a path of dependencies
from v1 to v2
n A view relation v is said to be recursive if it depends on itself.
View Expansion
n A way to define the meaning of views defined in terms of other
views.
n Let view v1 be defined by an expression e1 that may itself
contain uses of view relations.
n View expansion of an expression repeats the following
replacement step:
repeat
Find any view relation vi in e1
Replace the view relation vi by the expression defining vi
until no more view relations are present in e1
n As long as the view definitions are not recursive, this loop will
terminate
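The replacement loop can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: view definitions are plain strings held in a hypothetical `views` dictionary, and substitution is textual, whereas a real database system expands views on a parsed query.

```python
import re

# Hypothetical, simplified view definitions: view name -> defining expression.
views = {
    "physics_fall_2009":
        "select course_id, building from section where year = 2009",
    "physics_fall_2009_watson":
        "select course_id from physics_fall_2009 where building = 'Watson'",
}

def expand(expr, views):
    """Repeat the replacement step until no view name is left in expr."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, views)) + r")\b")
    while True:
        match = pattern.search(expr)
        if match is None:
            return expr            # no more view relations are present
        definition = views[match.group(1)]
        expr = expr[:match.start()] + "(" + definition + ")" + expr[match.end():]

expanded = expand("select * from physics_fall_2009_watson", views)
```

With a recursive view definition the loop above would never terminate, which matches the condition stated on the slide.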
Update of a View
n Add a new tuple to faculty view which we defined earlier
insert into faculty values (’30765’, ’Green’, ’History’);
This insertion must be represented by the insertion of the tuple
(’30765’, ’Green’, ’History’, null)
into the instructor relation
Some Updates cannot be Translated Uniquely
n
create view instructor_info as
select ID, name, building
from instructor, department
where instructor.dept_name= department.dept_name;
n /* caution: next query causes error */
insert into instructor_info values (69987, ’White’, ’Taylor’);
4 which department, if multiple departments in Taylor?
4 what if no department is in Taylor?
n Most SQL implementations allow updates only on simple views
l
The from clause has only one database relation.
l
The select clause contains only attribute names of the
relation, and does not have any expressions, aggregates, or
distinct specification.
l
Any attribute not listed in the select clause can be set to null
l
The query does not have a group by or having clause.
And Some Not at All
n create view history_instructors as
select ID, name, dept_name, salary
from instructor
where dept_name= ’History’;
n What happens if we execute
insert into history_instructors values (25566, ’Brown’, ’Biology’, 100000)?
n Logical error! The new tuple has dept_name Biology, so it would not appear in the history_instructors view.
Materialized Views
n Materializing a view: create a physical table containing all the tuples
in the result of the query defining the view
n If relations used in the query are updated, the materialized view result
becomes out of date
l
Need to maintain the view, by updating the view whenever the
underlying relations are updated.
create view dbo.history_instructors_materialized with schemabinding
as
select ID, name, dept_name, salary
from dbo.instructor
where dept_name= 'History';
Imagine that you have created a view without SCHEMABINDING and then altered the
schema of the underlying table (e.g., deleted a column). The next time you query the
view, it will fail. Try it with a change in instructor (e.g., rename name to surname).
Creating a view with the SCHEMABINDING option binds it to the underlying tables and
prevents any changes to them that would affect the view definition.
Remember that objects must be referred to by their two-part names
(ownername.objectname), e.g., dbo.instructor
Note: in SQL Server, the result of a schema-bound view is stored physically only once a
unique clustered index is created on the view (an indexed view).
Integrity Constraints
n Integrity constraints guard against accidental damage to the
database, by ensuring that authorized changes to the
database do not result in a loss of data consistency.
l
A checking account must have a balance greater than
$10,000.00
l
A salary of a bank employee must be at least $4.00 an
hour
l
A customer must have a (non-null) phone number
Integrity Constraints on a Single Relation
n not null
n primary key
n unique
n check (P), where P is a predicate
Not Null and Unique Constraints
n not null
l
Declare name and budget to be not null
name varchar(20) not null
budget numeric(12,2) not null
n unique ( A1, A2, …, Am)
l
The unique specification states that the attributes A1, A2, …
Am
form a candidate key.
l
Candidate keys are permitted to be null (in contrast to primary
keys).
The check clause
n check (P)
where P is a predicate
Example: ensure that semester is one of fall, winter, spring
or summer:
create table section (
course_id varchar (8),
sec_id varchar (8),
semester varchar (6),
year numeric (4,0),
building varchar (15),
room_number varchar (7),
time_slot_id varchar (4),
primary key (course_id, sec_id, semester, year),
check (semester in (’Fall’, ’Winter’, ’Spring’, ’Summer’))
);
Try it with: insert into section values('105', '1', 'Sommer', 2009, 'Chandler', '375', 'C') (the misspelled 'Sommer' violates the check constraint)
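The check clause can be exercised with a shortened version of the section table. This is a sketch that assumes Python's built-in `sqlite3` module as a stand-in database; the building, room and time slot columns are omitted for brevity. A valid semester is accepted and a misspelled one is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
create table section (
    course_id varchar(8),
    sec_id varchar(8),
    semester varchar(6),
    year numeric(4,0),
    primary key (course_id, sec_id, semester, year),
    check (semester in ('Fall', 'Winter', 'Spring', 'Summer'))
)""")

conn.execute("insert into section values ('105', '1', 'Summer', 2009)")
try:
    # 'Sommer' is not in the allowed list, so the check constraint fails.
    conn.execute("insert into section values ('105', '1', 'Sommer', 2009)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

n_rows = conn.execute("select count(*) from section").fetchone()[0]
```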
Referential Integrity
n Ensures that a value that appears in one relation for a given
set of attributes also appears for a certain set of attributes in
another relation.
l
Example: If “Biology” is a department name appearing in
one of the tuples in the instructor relation, then there exists
a tuple in the department relation for “Biology”.
n Let A be a set of attributes. Let R and S be two relations that
contain attributes A and where A is the primary key of S. A is
said to be a foreign key of R if for any values of A appearing
in R these values also appear in S.
Cascading Actions in Referential Integrity
n create table ref_course (
course_id char(5) primary key,
title varchar(20),
dept_name varchar(20) references department
)
n create table ref_course_cascade (
course_id char(5) primary key,
title varchar(20),
dept_name varchar(20),
foreign key (dept_name) references department
on delete cascade
on update cascade
)
n alternative actions to cascade: set null, set default
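A minimal sketch of on delete cascade, assuming Python's built-in `sqlite3` module as a stand-in database. SQLite enforces foreign keys only after `pragma foreign_keys = on`, which is specific to that system; the department table is reduced to its key for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")   # SQLite-specific: enable enforcement
conn.executescript("""
create table department (dept_name varchar(20) primary key);
create table ref_course_cascade (
    course_id char(5) primary key,
    title varchar(20),
    dept_name varchar(20),
    foreign key (dept_name) references department
        on delete cascade
        on update cascade
);
insert into department values ('Astronomy');
insert into ref_course_cascade values ('54321', 'Black Holes', 'Astronomy');
delete from department where dept_name = 'Astronomy';
""")

# Deleting the department also deleted the course that referenced it.
remaining = conn.execute("select count(*) from ref_course_cascade").fetchone()[0]
```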
Cascading Actions in Referential Integrity
Try it:
n insert into ref_course
values('12345', 'Introduction', 'Football');
n delete from department
where dept_name = 'Athletics'
n insert into ref_course_cascade
values('54321', 'Black Holes', 'Astronomy');
n delete from department
where dept_name = 'Astronomy'
n select * from ref_course_cascade
Constraint Violation During Transactions
create table person (
ID char(10),
name char(40),
spouse char(10),
primary key (ID),
foreign key (spouse) references person)
n How to insert tuples without causing constraint violation?
Example: we want to insert John and Mary who are married
l
insert into person values ('123', 'John', '321');
l
insert into person values ('321', 'Mary', '123');
l
set spouse to null initially, update after inserting all persons (not
possible if spouse attributes declared to be not null)
l
OR defer constraint checking: declaring the constraint initially deferred causes
it to be checked at the end of the transaction
l
Note: many database implementations do not support deferred constraint checking
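Deferred checking can be tried in SQLite, which supports it for foreign keys. The sketch below makes several assumptions: it uses Python's built-in `sqlite3` module, spouse holds the ID of the other person, and transactions are managed explicitly through `isolation_level=None`. The first insert refers to a person that does not exist yet, and the constraint is only checked at commit.

```python
import sqlite3

# isolation_level=None: no implicit transactions, we issue begin/commit ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("pragma foreign_keys = on")   # SQLite-specific: enable enforcement
conn.execute("""
create table person (
    ID char(10) primary key,
    name char(40),
    spouse char(10) references person (ID) deferrable initially deferred
)""")

conn.execute("begin")
conn.execute("insert into person values ('123', 'John', '321')")  # '321' not yet present
conn.execute("insert into person values ('321', 'Mary', '123')")
conn.execute("commit")   # the foreign key is checked here, when both rows exist

count = conn.execute("select count(*) from person").fetchone()[0]
```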
Complex Check Clauses
n check constraints with subqueries:
check (time_slot_id in (select time_slot_id from time_slot))
l
Same as using a foreign key
n Check every section has at least one instructor teaching the section:
l
Idea: declare attributes (course_id, sec_id, semester, year) of the section relation as a
foreign key referencing the corresponding attributes of the teaches
relation
l
Problem: (course_id, sec_id, semester, year) is not a candidate key of the teaches
relation (the key also needs the ID of the instructor), so a foreign key cannot reference it
l
check ((course_id, sec_id, semester, year) in (select course_id, sec_id,
semester, year from teaches)) is a solution
n Unfortunately: subquery in check clause not supported by pretty
much any database
l
Alternative: triggers (not covered)
n create assertion <assertion-name> check <predicate>;
l
Also not supported by anyone
User-Defined Types
n create type construct in SQL creates user-defined type
create type Dollars as numeric (12,2) final
SQL Server equivalent:
create type Dollars from numeric(12,2)
l
create table department
(dept_name varchar (20),
building varchar (15),
budget Dollars);
Domains
n create domain construct in SQL-92 creates user-defined
domain types
create domain person_name char(20) not null
SQL Server has no create domain; a similar alias type can be declared with:
create type person_name from char(20) not null
n Types and domains are similar. Domains can have
constraints, such as not null, specified on them.
Large-Object Types
n Large objects (photos, videos, CAD files, etc.) are stored as a
large object:
l
blob: binary large object -- object is a large collection of
uninterpreted binary data (whose interpretation is left to an
application outside of the database system)
l
clob: character large object -- object is a large collection of
character data
l
When a query returns a large object, a pointer is returned
rather than the large object itself.
The SQL Server ntext, text, and image data types are capable of holding
extremely large amounts of data, up to 2 GB, in a single value.
Advanced SQL
Advanced SQL
n Functions and Procedural Constructs
n Recursion
n Ranking
Procedural Constructs in SQL
Procedural Extensions and Stored Procedures
n SQL provides a module language
l
Permits definition of procedures in SQL, with if-then-else
statements, for and while loops, etc.
n Stored Procedures
l
Can store procedures in the database
l
then execute them using the call statement
l
permit external applications to operate on the database
without knowing about internal details
Functions and Procedures
n SQL:1999 supports functions and procedures
l
Functions/procedures can be written in SQL itself, or in an
external programming language.
l
Functions are particularly useful with specialized data types such
as images and geometric objects.
4 Example:
functions to check if polygons overlap, or to
compare images for similarity.
l
Some database systems support table-valued functions, which
can return a relation as a result.
n SQL:1999 also supports a rich set of imperative constructs, including
l
Loops, if-then-else, assignment
n Many databases have proprietary procedural extensions to SQL that
differ from SQL:1999.
SQL Functions
n Define a function that, given the name of a department,
returns the count of the number of instructors in that
department.
create function dept_count (dept_name varchar(20))
returns integer
begin
declare d_count integer;
select count (* ) into d_count
from instructor
where instructor.dept_name = dept_name;
return d_count;
end
n The same function in SQL Server (Transact-SQL) syntax:
create function dept_count(@dept_name varchar(20))
returns int
as
begin
declare @d_count int;
select @d_count = count (*)
from instructor
where instructor.dept_name = @dept_name;
return @d_count;
end
SQL Functions
n Find the department name and budget of all
departments with more than one instructor.
select dept_name, budget
from department
where dept_count (dept_name ) > 1
n In SQL Server, a scalar function must be called with its schema name:
select dept_name, budget
from department
where dbo.dept_count (dept_name ) > 1
Table Functions
n
SQL:2003 added functions that return a relation as a result
n
Example: Return all instructors of a given department
create function instructors_of (dept_name char(20))
returns table ( ID varchar(5),
name varchar(20),
dept_name varchar(20),
salary numeric(8,2))
return table
(select ID, name, dept_name, salary
from instructor
where instructor.dept_name = instructors_of.dept_name)
create function instructors_of(@dept_name varchar(20))
returns @instructors_of_table table
(ID varchar(5), name varchar(20), dept_name varchar(20), salary numeric(8,2) )
as begin
insert @instructors_of_table
select instructor.ID, instructor.name, instructor.dept_name, instructor.salary
from instructor
where instructor.dept_name = @dept_name;
return;
end
Table Functions
n Usage
select *
from table (instructors_of ('Music'))
select *
from instructors_of('Music')
SQL Procedures
n The dept_count function could instead be written as a procedure:
create procedure dept_count_proc (in dept_name varchar(20),
out d_count integer)
begin
select count(*) into d_count
from instructor
where instructor.dept_name = dept_count_proc.dept_name;
end
create procedure dept_count_proc
@dept_name varchar(20),
@d_count int OUTPUT
as
select @d_count = count(*)
from instructor
where instructor.dept_name = @dept_name
/* optional printing */
print 'The count is ' + RTRIM(CAST(@d_count AS varchar(20)))
SQL Procedures
n Procedures can be invoked either from an SQL procedure or from
embedded SQL, using the call statement.
declare d_count integer;
call dept_count_proc('Physics', d_count);
n Procedures and functions can also be invoked from dynamic SQL
declare @d_count int
EXECUTE dept_count_proc 'Comp. Sci.', @d_count OUTPUT
n SQL:1999 allows more than one function/procedure of the same
name (called name overloading), as long as the number of
arguments differs, or at least the types of the arguments differ
Procedural Constructs
n Warning: most database systems implement their own variant of the
standard syntax below
l
read your system manual to see what works on your system
n Compound statement: begin … end,
l
May contain multiple SQL statements between begin and end.
l
Local variables can be declared within a compound statement
n While and repeat statements:
declare n integer default 0;
while n < 10 do
set n = n + 1;
end while
repeat
set n = n - 1;
until n = 0
end repeat
Procedural Constructs (Cont.)
n For loop
l
Permits iteration over all results of a query
l
Example:
declare n integer default 0;
for r as
select budget from department
where dept_name = 'Music'
do
set n = n - r.budget
end for
Procedural Constructs (cont.)
n Conditional statements (if-then-else)
SQL:1999 also supports a case statement similar to C case statement
n Example procedure: registers student after ensuring classroom capacity
is not exceeded
l
Returns 0 on success and -1 if capacity is exceeded
l
See book for details
n Signaling of exception conditions, and declaring handlers for exceptions
declare out_of_classroom_seats condition
declare exit handler for out_of_classroom_seats
begin
…
.. signal out_of_classroom_seats
end
l
The handler here is exit -- causes enclosing begin..end to be exited
l
Other actions possible on exception
External Language Functions/Procedures
n SQL:1999 permits the use of functions and procedures written in
other languages such as C or C++
n Declaring external language procedures and functions
create procedure dept_count_proc(in dept_name varchar(20),
out count integer)
language C
external name '/usr/avi/bin/dept_count_proc'
create function dept_count(dept_name varchar(20))
returns integer
language C
external name '/usr/avi/bin/dept_count'
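As a loose analogue of an external-language routine, the sketch below registers a host-language function with an SQL engine and then calls it from a query. It assumes Python's standard sqlite3 module rather than the SQL:1999 "language C / external name" syntax shown above; the function budget_band, its threshold, and the sample rows are invented for illustration.

```python
import sqlite3

def budget_band(budget):
    """Host-language routine (hypothetical): classify a department budget."""
    if budget is None:
        return None
    return "high" if budget >= 75000 else "low"

conn = sqlite3.connect(":memory:")
# Make the Python function callable from SQL under the name budget_band (1 argument).
conn.create_function("budget_band", 1, budget_band)
conn.executescript("""
    create table department (dept_name text primary key, budget numeric);
    insert into department values ('Physics', 70000), ('Music', 80000);
""")
bands = conn.execute(
    "select dept_name, budget_band(budget) from department order by dept_name"
).fetchall()
print(bands)  # [('Music', 'high'), ('Physics', 'low')]
```

SQLite is an embedded library, so the registered function runs inside the application's own process; this is only meant to show the registration step, not the isolation options a server DBMS may offer.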
External Language Routines (Cont.)
n Benefits of external language functions/procedures:
l
more efficient for many operations, and more expressive
power.
n Drawbacks
l
Code to implement function may need to be loaded into
database system and executed in the database system’s
address space.
4 risk of accidental corruption of database structures
4 security risk, allowing users access to unauthorized data
l
There are alternatives, which give good security at the cost of
potentially worse performance.
l
Direct execution in the database system’s space is used when
efficiency is more important than security.
Security with External Language Routines
n To deal with security problems
l
Use sandbox techniques
4 that is, use a safe language like Java, which cannot be
used to access/damage other parts of the database
code.
l
Or, run external language functions/procedures in a
separate process, with no access to the database process’
memory.
4 Parameters and results communicated via inter-process
communication
n Both have performance overheads
n Many database systems support both of the above approaches as
well as direct execution in the database system's address space.
Recursive Queries
Recursion in SQL
n SQL:1999 permits recursive view definition
n Example: find which courses are a prerequisite, whether
directly or indirectly, for a specific course
with recursive rec_prereq(course_id, prereq_id) as (
select course_id, prereq_id
from prereq
union
select rec_prereq.course_id, prereq.prereq_id
from rec_prereq, prereq
where rec_prereq.prereq_id = prereq.course_id
)
select *
from rec_prereq;
This example view, rec_prereq, is called the transitive closure
of the prereq relation
Note: 1st printing of 6th ed erroneously used c_prereq in place of
rec_prereq in some places
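The recursive view can be tried out on a small instance. The following is a minimal sketch that assumes SQLite (through Python's sqlite3 module), which accepts the with recursive form; the three prereq rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table prereq (course_id text, prereq_id text);
    insert into prereq values
        ('CS-347', 'CS-319'),
        ('CS-319', 'CS-101'),
        ('BIO-301', 'BIO-101');
""")
rows = conn.execute("""
    with recursive rec_prereq(course_id, prereq_id) as (
        select course_id, prereq_id
        from prereq
        union
        select rec_prereq.course_id, prereq.prereq_id
        from rec_prereq, prereq
        where rec_prereq.prereq_id = prereq.course_id
    )
    select prereq_id
    from rec_prereq
    where course_id = 'CS-347'
    order by prereq_id
""").fetchall()
prereqs = [r[0] for r in rows]
print(prereqs)  # ['CS-101', 'CS-319']
```

CS-101 appears although prereq has no (CS-347, CS-101) row: it is reached indirectly through CS-319. Using union rather than union all discards duplicate tuples, so the iteration stops even if prereq contained a cycle.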
Recursion in SQL
with rec_prereq as (
select course_id, prereq_id
from prereq
union all
select rec_prereq.course_id, prereq.prereq_id
from rec_prereq inner join prereq on
rec_prereq.prereq_id = prereq.course_id
)
select *
from rec_prereq OPTION (MAXRECURSION 5);
Check:
course_id = 972, prereq_id = 139 because
course_id = 972, prereq_id = 958 and
course_id = 958, prereq_id = 139
Uncomment the 2 entries of largeRelationsInsertFile.sql about insertions in
prereq. What happens now?
The Power of Recursion
n Recursive views make it possible to write queries, such as
transitive closure queries, that cannot be written without recursion
or iteration.
l
Intuition: Without recursion, a non-recursive non-iterative
program can perform only a fixed number of joins of prereq
with itself
4 This can give only a fixed number of levels of prerequisites
4 Given a fixed non-recursive query, we can construct a
database with a greater number of levels of prerequisites on
which the query will not work
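The intuition can be checked on a tiny instance. The sketch below is plain Python, not SQL; the course names and the helper transitive_closure are hypothetical. It compares one self-join of prereq with an iteration that repeats the join until nothing new appears.

```python
def transitive_closure(pairs):
    """Repeat the self-join until no new (course, prerequisite) pair is found."""
    closure = set(pairs)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in pairs if b == c}
        if new <= closure:
            return closure
        closure |= new

# A chain with three levels of prerequisites: C4 -> C3 -> C2 -> C1.
chain = {("C4", "C3"), ("C3", "C2"), ("C2", "C1")}

# A fixed, non-recursive query: direct prerequisites plus exactly one join.
one_join = chain | {(a, d) for (a, b) in chain for (c, d) in chain if b == c}

print(("C4", "C1") in one_join)                   # False
print(("C4", "C1") in transitive_closure(chain))  # True
```

Adding a second fixed join would catch (C4, C1), but a chain with four levels would then defeat that query in the same way.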
Advanced Aggregation Features
Ranking
n Ranking is done in conjunction with an order by specification.
n Suppose we are given a relation
student_grades(ID, GPA) (see Exercise 2A.2c)
giving the grade-point average of each student
n Find the rank of each student.
select ID, rank() over (order by GPA desc) as s_rank
from student_grades
n An extra order by clause is needed to get them in sorted order
select ID, rank() over (order by GPA desc) as s_rank
from student_grades
order by s_rank
n Ranking may leave gaps: e.g. if 2 students have the same top GPA,
both have rank 1, and the next rank is 3
l
dense_rank does not leave gaps, so next dense rank would be 2
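The gap behaviour is easy to see on a four-row instance. This sketch assumes an SQLite build with window-function support (version 3.25 or later, which recent Python distributions normally bundle); the student IDs and GPAs are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table student_grades (ID text primary key, GPA real);
    insert into student_grades values
        ('S1', 3.9), ('S2', 3.9), ('S3', 3.5), ('S4', 3.0);
""")
ranked = conn.execute("""
    select ID,
           rank() over (order by GPA desc) as s_rank,
           dense_rank() over (order by GPA desc) as d_rank
    from student_grades
    order by s_rank, ID
""").fetchall()
print(ranked)
# [('S1', 1, 1), ('S2', 1, 1), ('S3', 3, 2), ('S4', 4, 3)]
```

S1 and S2 tie at rank 1; rank() then jumps to 3 for S3, while dense_rank() continues with 2.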
Ranking
n Ranking can be done using basic SQL aggregation, but
resultant query is very inefficient
select ID, (1 + (select count(*)
from student_grades B
where B.GPA > A.GPA)) as s_rank
from student_grades A
order by s_rank;
the rank of a student is merely 1 plus the number of students with a higher GPA
overall time quadratic in the size of the relation
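For comparison, the aggregation-based formulation can be run without any window functions. A minimal sketch, again assuming SQLite through Python's sqlite3 module and invented sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table student_grades (ID text primary key, GPA real);
    insert into student_grades values
        ('S1', 3.9), ('S2', 3.9), ('S3', 3.5), ('S4', 3.0);
""")
# For every student A, count the students B with a strictly higher GPA.
by_subquery = conn.execute("""
    select ID, (1 + (select count(*)
                     from student_grades B
                     where B.GPA > A.GPA)) as s_rank
    from student_grades A
    order by s_rank, ID
""").fetchall()
print(by_subquery)  # [('S1', 1), ('S2', 1), ('S3', 3), ('S4', 4)]
```

The inner query is evaluated once per outer tuple, which is where the quadratic cost comes from when no index helps.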
Ranking (Cont.)
create view dept_grades as
select student.ID, student.dept_name, student_grades.GPA
from student join student_grades on student.ID = student_grades.ID
n Ranking can be done within partition of the data.
n “Find the rank of students within each department.”
select ID, dept_name,
rank () over (partition by dept_name order by GPA desc)
as dept_rank
from dept_grades
order by dept_name, dept_rank;
n Multiple rank clauses can occur in a single select clause.
n Ranking is done after applying group by clause/aggregation
n Can be used to find top-n results
l
More general than the limit n clause supported by many
databases, since it allows top-n within each partition
select top n
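A top-n query within each partition can be sketched by ranking in a subquery and filtering on the rank outside it. The example assumes SQLite 3.25 or later via Python's sqlite3 module; dept_grades is created here as a plain table with invented rows instead of the view defined above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table dept_grades (ID text primary key, dept_name text, GPA real);
    insert into dept_grades values
        ('S1', 'Physics', 3.9), ('S2', 'Physics', 3.5), ('S3', 'Physics', 3.1),
        ('S4', 'Music', 3.8), ('S5', 'Music', 3.2);
""")
# Top 2 students per department: rank inside each partition, then filter.
top2 = conn.execute("""
    select ID, dept_name, dept_rank
    from (select ID, dept_name,
                 rank() over (partition by dept_name order by GPA desc)
                     as dept_rank
          from dept_grades)
    where dept_rank <= 2
    order by dept_name, dept_rank
""").fetchall()
print(top2)
# [('S4', 'Music', 1), ('S5', 'Music', 2), ('S1', 'Physics', 1), ('S2', 'Physics', 2)]
```

The filter has to sit in an outer query because the rank is computed after the inner where clause has been applied.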
Ranking (Cont.)
n Other ranking functions:
l
percent_rank (within partition, if partitioning is done)
l
cume_dist (cumulative distribution)
4 fraction of tuples with preceding values
l
row_number (non-deterministic in presence of duplicates)
n SQL:1999 permits the user to specify nulls first or nulls last
select ID,
rank ( ) over (order by GPA desc nulls last) as s_rank
from student_grades
select ID, rank ( ) over
(order by (CASE WHEN GPA IS NULL THEN 1.79E+308 ELSE GPA END) desc)
as s_rank
from student_grades
Ranking (Cont.)
n For a given constant n, the ranking function ntile(n) takes
the tuples in each partition in the specified order, and divides
them into n buckets with equal numbers of tuples (bucket sizes
differ by at most 1 when the count is not divisible by n).
n E.g.,
select ID, ntile(4) over (order by GPA desc) as quartile
from student_grades;
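The quartile query can be exercised on eight rows, which divide evenly into four buckets. A sketch under the same assumptions as before (SQLite 3.25 or later through Python's sqlite3 module, invented IDs and GPAs):

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
conn.execute("create table student_grades (ID text primary key, GPA real)")
gpas = [3.9, 3.8, 3.7, 3.6, 3.5, 3.4, 3.3, 3.2]
conn.executemany(
    "insert into student_grades values (?, ?)",
    [(f"S{i}", gpa) for i, gpa in enumerate(gpas, start=1)],
)
quartiles = conn.execute("""
    select ID, ntile(4) over (order by GPA desc) as quartile
    from student_grades
    order by quartile, ID
""").fetchall()
# Number of students that landed in each bucket.
bucket_sizes = Counter(q for _, q in quartiles)
print(dict(bucket_sizes))  # {1: 2, 2: 2, 3: 2, 4: 2}
```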
End
Entity-Relationship Model
Entity-Relationship Model
n Design Process
n Modeling
n Constraints
n E-R Diagram
n Design Issues
n Weak Entity Sets
n Extended E-R Features
n Design of the Bank Database
n Reduction to Relation Schemas
n Database Design
Modeling
n A database can be modeled as:
l
a collection of entities,
l
relationships among entities.
n An entity is an object that exists and is distinguishable from other
objects.
l
Example: specific person, company, event, plant
n Entities have attributes
l
Example: people have names and addresses
n An entity set is a set of entities of the same type that share the same
properties.
l
Example: set of all persons, companies, trees, holidays
Entity Sets instructor and student
[Figure: entity sets instructor (instructor_ID, instructor_name) and student (student-ID, student_name)]
Relationship Sets
n A relationship is an association among several entities
Example: 44553 (Peltier), a student entity, is related through the advisor
relationship set to 22222 (Einstein), an instructor entity
n A relationship set is a mathematical relation among $n \geq 2$ entities, each
taken from entity sets
$\{(e_1, e_2, \ldots, e_n) \mid e_1 \in E_1, e_2 \in E_2, \ldots, e_n \in E_n\}$
where $(e_1, e_2, \ldots, e_n)$ is a relationship
l
Example:
$(44553, 22222) \in advisor$
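Read as sets, the definition says that a relationship set is a subset of the Cartesian product of the participating entity sets. A small Python sketch, in which the second student and second instructor ID are invented for illustration:

```python
from itertools import product

# Entity sets, represented here only by their identifying values.
student = {"44553", "12345"}
instructor = {"22222", "45565"}

# The advisor relationship set: one (student, instructor) tuple per relationship.
advisor = {("44553", "22222"), ("12345", "45565")}

# Every possible pairing of a student with an instructor.
all_pairs = set(product(student, instructor))

print(advisor <= all_pairs)             # True: advisor is a subset of the product
print(("44553", "22222") in advisor)    # True
print(("44553", "45565") in advisor)    # False: a possible pair that is not related
```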
Relationship Set advisor
Relationship Sets (Cont.)
n An attribute can also be a property of a relationship set.
n For instance, the advisor relationship set between entity sets
instructor and student may have the attribute date which tracks when
the student started being associated with the advisor
Degree of a Relationship Set
n binary relationship
l
involve two entity sets (or degree two).
l
most relationship sets in a database system are binary.
n Relationships between more than two entity sets are rare. Most
relationships are binary. (More on this later.)
4 Example: students work on research projects under the
guidance of an instructor.
4 relationship proj_guide is a ternary relationship between
instructor, student, and project
Attributes
n An entity is represented by a set of attributes, that is, descriptive
properties possessed by all members of an entity set.
l
Example:
instructor = (ID, name, street, city, salary )
course= (course_id, title, credits)
n Domain – the set of permitted values for each attribute
n Attribute types:
l
Simple and composite attributes.
l
Single-valued and multivalued attributes
4
Example: multivalued attribute: phone_numbers
l
Derived attributes
4
Can be computed from other attributes
4
Example: age, given date_of_birth
Composite Attributes
Mapping Cardinality Constraints
n Express the number of entities to which another entity can be
associated via a relationship set.
n Most useful in describing binary relationship sets.
n For a binary relationship set the mapping cardinality must be one of
the following types:
l
One to one
l
One to many
l
Many to one
l
Many to many
Mapping Cardinalities
One to many
One to one
Note: Some elements in A and B may not be mapped to any
elements in the other set
Mapping Cardinalities
Many to one
Many to many
Note: Some elements in A and B may not be mapped to any
elements in the other set
Keys
n A super key of an entity set is a set of one or more attributes
whose values uniquely determine each entity.
n A candidate key of an entity set is a minimal super key
l
ID is candidate key of instructor
l
course_id is candidate key of course
n Although several candidate keys may exist, one of the candidate
keys is selected to be the primary key.
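The two definitions can be made concrete with a small check. The sketch below is plain Python with hypothetical helper names and invented rows. It tests uniqueness on one stored instance only, whereas a key is a property of every legal instance, so it can refute a proposed key but never establish one.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def candidate_keys(rows, attributes):
    """Minimal super keys of this instance, smallest attribute sets first."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            contains_smaller_key = any(set(k) <= set(attrs) for k in keys)
            if not contains_smaller_key and is_superkey(rows, attrs):
                keys.append(attrs)
    return keys

instructor = [
    {"ID": "22222", "name": "Einstein", "dept_name": "Physics"},
    {"ID": "33456", "name": "Gold", "dept_name": "Physics"},
    {"ID": "10101", "name": "Einstein", "dept_name": "Music"},
]
found = candidate_keys(instructor, ("ID", "name", "dept_name"))
print(found)  # [('ID',), ('name', 'dept_name')]
```

On this instance (name, dept_name) also happens to be unique. Whether it is a real candidate key depends on the enterprise's rules: two instructors with the same name in one department would break it, which is why keys are declared from semantics rather than inferred from data.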
Keys for Relationship Sets
n The combination of primary keys of the participating entity sets
forms a super key of a relationship set.
l
(s_id, i_id) is the super key of advisor
l
NOTE: this means a pair of entities can have at most one
relationship in a particular relationship set.
4
Example: if we wish to track multiple meeting dates between
a student and her advisor, we cannot assume a relationship
for each meeting. We can use a multivalued attribute
though
n Must consider the mapping cardinality of the relationship set when
deciding what are the candidate keys
n Need to consider semantics of relationship set in selecting the
primary key in case of more than one candidate key
Redundant Attributes
n Suppose we have entity sets
l
instructor, with attributes including dept_name
l
department
and a relationship
l
inst_dept relating instructor and department
n Attribute dept_name in entity instructor is redundant since there is an
explicit relationship inst_dept which relates instructors to departments
l
The attribute replicates information present in the relationship, and
should be removed from instructor
l
BUT: when converting back to tables, in some cases the attribute
gets reintroduced, as we will see.
E-R Diagrams
n Rectangles represent entity sets.
n Diamonds represent relationship sets.
n Attributes listed inside entity rectangle
n Underline indicates primary key attributes
Entity With Composite, Multivalued, and Derived
Attributes
Relationship Sets with Attributes
Roles
n Entity sets of a relationship need not be distinct
l
Each occurrence of an entity set plays a “role” in the relationship
n The labels “course_id” and “prereq_id” are called roles.
Cardinality Constraints
n We express cardinality constraints by drawing either a directed line
(→), signifying “one,” or an undirected line (—), signifying “many,”
between the relationship set and the entity set.
n One-to-one relationship:
l A student is associated with at most one instructor via the
relationship advisor
l A student is associated with at most one department via
stud_dept
One-to-One Relationship
n one-to-one relationship between an instructor and a student
l
an instructor is associated with at most one student via advisor
l
and a student is associated with at most one instructor via
advisor
One-to-Many Relationship
n one-to-many relationship between an instructor and a student
l
an instructor is associated with several (including 0) students
via advisor
l
a student is associated with at most one instructor via advisor
Many-to-One Relationships
n In a many-to-one relationship between an instructor and a student,
l
an instructor is associated with at most one student via
advisor,
l
and a student is associated with several (including 0)
instructors via advisor
Many-to-Many Relationship
n An instructor is associated with several (possibly 0) students via
advisor
n A student is associated with several (possibly 0) instructors via
advisor
Participation of an Entity Set in a
Relationship Set
n Total participation (indicated by double line): every entity in the
entity set participates in at least one relationship in the relationship
set
l
E.g., participation of section in sec_course is total
4
every section must have an associated course
n Partial participation: some entities may not participate in any
relationship in the relationship set
l
Example: participation of instructor in advisor is partial
Alternative Notation for Cardinality Limits
n Cardinality limits can also express participation constraints
E-R Diagram with a Ternary Relationship
Cardinality Constraints on Ternary
Relationship
n We allow at most one arrow out of a ternary (or greater degree)
relationship to indicate a cardinality constraint
n E.g., an arrow from proj_guide to instructor indicates each student has
at most one guide for a project
n If there is more than one arrow, there are two ways of defining the
meaning.
l
E.g., a ternary relationship R between A, B and C with arrows to B
and C could mean
1. each A entity is associated with a unique entity from B and C or
2. each pair of entities from (A, B) is associated with a unique C
entity, and each pair (A, C) is associated with a unique B
l
Each alternative has been used in different formalisms
l
To avoid confusion we outlaw more than one arrow
Weak Entity Sets
n An entity set that does not have a primary key is referred to as a
weak entity set.
n The existence of a weak entity set depends on the existence of an
identifying entity set
l
It must relate to the identifying entity set via a total, one-to-many
relationship set from the identifying to the weak entity set
l
Identifying relationship depicted using a double diamond
n The discriminator (or partial key) of a weak entity set is the set of
attributes that distinguishes among all the entities of a weak entity
set for a given strong entity (e.g., insured childrens’ first name)
n The primary key of a weak entity set is formed by the primary key of
the strong entity set on which the weak entity set is existence
dependent, plus the weak entity set’s discriminator.
Weak Entity Sets (Cont.)
n We underline the discriminator of a weak entity set with a dashed
line.
n We put the identifying relationship of a weak entity in a double
diamond.
n Primary key for section – (course_id, sec_id, semester, year)
Weak Entity Sets (Cont.)
n Note: the primary key of the strong entity set is not explicitly stored
with the weak entity set, since it is implicit in the identifying
relationship.
n If course_id were explicitly stored, section could be made a strong
entity, but then the relationship between section and course would
be duplicated by an implicit relationship defined by the attribute
course_id common to course and section
E-R Diagram for a University Enterprise
Symbols used in the E-R notation.
Reduction to Relational Schemas
Reduction to Relation Schemas
n Entity sets and relationship sets can be expressed uniformly as
relation schemas that represent the contents of the database.
n A database which conforms to an E-R diagram can be represented by
a collection of schemas.
n For each entity set and relationship set there is a unique schema that
is assigned the name of the corresponding entity set or relationship
set.
n Each schema has a number of columns (generally corresponding to
attributes), which have unique names.
Representing Entity Sets With Simple
Attributes
n A strong entity set reduces to a schema with the same attributes
student(ID, name, tot_cred)
n A weak entity set becomes a table that includes a column for the primary
key of the identifying strong entity set
section ( course_id, sec_id, sem, year )
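One way the section schema might be declared is sketched below, assuming SQLite through Python's sqlite3 module. The column types, the title column, and the sample rows are assumptions; the point is that the identifying entity set's key (course_id) is part of the primary key and also references course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite enforces foreign keys only on request
conn.executescript("""
    create table course (course_id text primary key, title text);
    create table section (
        course_id text references course(course_id),
        sec_id    text,
        sem       text,
        year      integer,
        -- key of the strong entity set plus the discriminator
        primary key (course_id, sec_id, sem, year)
    );
    insert into course values ('CS-101', 'Intro');
    insert into section values ('CS-101', '1', 'Fall', 2009);
""")
try:
    # A section of a course that does not exist cannot be stored.
    conn.execute("insert into section values ('CS-999', '1', 'Fall', 2009)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(orphan_rejected)  # True
```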
Representing Relationship Sets
n A many-to-many relationship set is represented as a schema with
attributes for the primary keys of the two participating entity sets, and any
descriptive attributes of the relationship set.
n Example: schema for relationship set advisor
advisor = (s_id, i_id)
Redundancy of Schemas
n Many-to-one and one-to-many relationship sets that are total on the
many-side can be represented by adding an extra attribute to the
“many” side, containing the primary key of the “one” side
n Example: Instead of creating a schema for relationship set inst_dept,
add an attribute dept_name to the schema arising from entity set
instructor
Redundancy of Schemas (Cont.)
n For one-to-one relationship sets, either side can be chosen to act
as the “many” side
l That is, extra attribute can be added to either of the tables
corresponding to the two entity sets
n If participation is partial on the “many” side, replacing a schema by
an extra attribute in the schema corresponding to the “many” side
could result in null values
n The schema corresponding to a relationship set linking a weak
entity set to its identifying strong entity set is redundant.
l
Example: The section schema already contains the attributes
that would appear in the sec_course schema
Composite and Multivalued Attributes
n Composite attributes are flattened out by creating a
separate attribute for each component attribute
l
Example: given entity set instructor with
composite attribute name with component
attributes first_name and last_name the schema
corresponding to the entity set has two attributes
name_first_name and name_last_name
4
Prefix omitted if there is no ambiguity
n Ignoring multivalued attributes, extended instructor
schema is
l
instructor(ID,
first_name, middle_initial, last_name,
street_number, street_name,
apt_number, city, state, zip_code,
date_of_birth)
Composite and Multivalued Attributes
n A multivalued attribute M of an entity E is represented by a separate
schema EM
l
Schema EM has attributes corresponding to the primary key of E
and an attribute corresponding to multivalued attribute M
l
Example: Multivalued attribute phone_number of instructor is
represented by a schema:
inst_phone= ( ID, phone_number)
l
Each value of the multivalued attribute maps to a separate tuple of
the relation on schema EM
4
For example, an instructor entity with primary key 22222 and
phone numbers 456-7890 and 123-4567 maps to two tuples:
(22222, 456-7890) and (22222, 123-4567)
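The mapping from one entity with a multivalued attribute to several tuples can be written as a short Python sketch; the helper name flatten_multivalued and the dictionary layout are illustrative assumptions.

```python
def flatten_multivalued(entity, key_attr, multi_attr):
    """One (key, value) tuple per value of the multivalued attribute."""
    return [(entity[key_attr], value) for value in entity[multi_attr]]

# The instructor entity from the example above, with two phone numbers.
einstein = {"ID": "22222", "phone_number": ["456-7890", "123-4567"]}

inst_phone = flatten_multivalued(einstein, "ID", "phone_number")
print(inst_phone)  # [('22222', '456-7890'), ('22222', '123-4567')]
```

An instructor with no phone numbers contributes no tuples to inst_phone at all.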
Multivalued Attributes (Cont.)
n Special case: entity time_slot has only one attribute other than the
primary-key attribute, and that attribute is multivalued
l
Optimization: Don’t create the relation corresponding to the entity,
just create the one corresponding to the multivalued attribute
l
time_slot(time_slot_id, day, start_time, end_time)
l
Caveat: time_slot attribute of section (from sec_time_slot) cannot be
a foreign key due to this optimization
Design Issues
n Use of entity sets vs. attributes
n Use of phone as an entity allows extra information about phone numbers
(plus multiple phone numbers)
Design Issues
n Use of entity sets vs. relationship sets
Possible guideline is to designate a relationship set to describe an action
that occurs between entities
Design Issues
n Binary versus n-ary relationship sets
Although it is possible to replace any nonbinary (n-ary, for n > 2)
relationship set by a number of distinct binary relationship sets, an n-ary
relationship set shows more clearly that several entities participate in a
single relationship.
n Placement of relationship attributes
e.g., attribute date as attribute of advisor or as attribute of student
Binary Vs. Non-Binary Relationships
n Some relationships that appear to be non-binary may be better
represented using binary relationships
l
E.g., A ternary relationship parents, relating a child to his/her
father and mother, is best replaced by two binary relationships,
father and mother
4
Using two binary relationships allows partial information (e.g.,
only mother being known)
l
But there are some relationships that are naturally non-binary
4
Example: proj_guide
Converting Non-Binary Relationships to Binary Form
n In general, any non-binary relationship can be represented using
binary relationships by creating an artificial entity set.
l Replace R between entity sets A, B and C by an entity set E, and
three relationship sets:
1. RA, relating E and A
2. RB, relating E and B
3. RC, relating E and C
l Create a special identifying attribute for E
l Add any attributes of R to E
l For each relationship (ai , bi , ci) in R, create
1. a new entity ei in the entity set E
2. add (ei , ai ) to RA
3. add (ei , bi ) to RB
4. add (ei , ci ) to RC
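The construction above can be sketched in a few lines of Python. The function names, the generated identifiers e0, e1, and the sample proj_guide tuples (instructor, student, project) are assumptions made for illustration.

```python
def to_binary(r):
    """Replace ternary relationship set r by entity set E and binary sets RA, RB, RC."""
    e, ra, rb, rc = [], [], [], []
    for i, (a, b, c) in enumerate(sorted(r)):
        ei = f"e{i}"        # special identifying attribute for the new entity
        e.append(ei)
        ra.append((ei, a))
        rb.append((ei, b))
        rc.append((ei, c))
    return e, ra, rb, rc

def to_ternary(e, ra, rb, rc):
    """Rebuild the ternary relationships by following each new entity's three links."""
    a_of, b_of, c_of = dict(ra), dict(rb), dict(rc)
    return {(a_of[ei], b_of[ei], c_of[ei]) for ei in e}

proj_guide = {("Einstein", "Peltier", "P1"), ("Einstein", "Shankar", "P2")}
e, ra, rb, rc = to_binary(proj_guide)
print(ra)  # [('e0', 'Einstein'), ('e1', 'Einstein')]
print(to_ternary(e, ra, rb, rc) == proj_guide)  # True
```

The round trip succeeds only because each new entity has exactly one partner in each of RA, RB and RC; nothing in the binary schema by itself enforces that.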
Converting Non-Binary Relationships
(Cont.)
n Also need to translate constraints
l
Translating all constraints may not be possible
l
There may be instances in the translated schema that
cannot correspond to any instance of R
4
Exercise: add constraints to the relationships RA, RB and
RC to ensure that a newly created entity corresponds to
exactly one entity in each of entity sets A, B and C
l
We can avoid creating an identifying attribute by making E a
weak entity set (described earlier) identified by the three
relationship sets
Extended ER Features
Extended E-R Features: Specialization
n Top-down design process; we designate subgroupings within an entity set
that are distinctive from other entities in the set.
n These subgroupings become lower-level entity sets that have attributes or
participate in relationships that do not apply to the higher-level entity set.
n Depicted by a triangle component labeled ISA (E.g., instructor “is a”
person).
n Attribute inheritance – a lower-level entity set inherits all the attributes
and relationship participation of the higher-level entity set to which it is
linked.
Specialization Example
Extended ER Features: Generalization
n A bottom-up design process – combine a number of entity sets
that share the same features into a higher-level entity set.
n Specialization and generalization are simple inversions of each
other; they are represented in an E-R diagram in the same way.
n The terms specialization and generalization are used
interchangeably.
Specialization and Generalization (Cont.)
n Can have multiple specializations of an entity set based on different
features.
n E.g., permanent_employee vs. temporary_employee, in addition to
instructor vs. secretary
n Each particular employee would be
l
a member of one of permanent_employee or temporary_employee,
l
and also a member of one of instructor, secretary
n The ISA relationship is also referred to as a superclass-subclass
relationship
Design Constraints on a
Specialization/Generalization
n Constraint on which entities can be members of a given lower-level entity
set.
l
condition-defined
4
Example: all customers over 65 years are members of senior-citizen
entity set; senior-citizen ISA person.
l
user-defined
n Constraint on whether or not entities may belong to more than one
lower-level entity set within a single generalization.
l
Disjoint
4
an entity can belong to only one lower-level entity set
4
Noted in E-R diagram by having multiple lower-level entity sets link
to the same triangle
l
Overlapping
4
an entity can belong to more than one lower-level entity set
Database System Concepts - 6th Edition
7.55
©Silberschatz, Korth and Sudarshan
Design Constraints on a Specialization/Generalization (Cont.)
n Completeness constraint: specifies whether or not an entity in
the higher-level entity set must belong to at least one of the lower-level
entity sets within a generalization.
l total: an entity must belong to one of the lower-level entity sets
4 noted in the E-R diagram by adding the keyword “total”
l partial: an entity need not belong to one of the lower-level
entity sets
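The disjointness and completeness constraints can be read as simple set conditions. The following is a minimal sketch, assuming each entity set is modeled as a Python set of entity IDs; the IDs and the helper names `is_disjoint` and `is_total` are illustrative and not part of the slides.

```python
# Hypothetical entity sets, each modeled as a set of entity IDs.
person = {"p1", "p2", "p3", "p4"}
student = {"p1", "p2"}
employee = {"p2", "p3"}


def is_disjoint(*lower_level):
    """True if no entity belongs to more than one lower-level entity set."""
    seen = set()
    for entity_set in lower_level:
        if seen & entity_set:
            return False
        seen |= entity_set
    return True


def is_total(higher_level, *lower_level):
    """True if every higher-level entity is in at least one lower-level set."""
    return higher_level <= set().union(*lower_level)


print(is_disjoint(student, employee))       # p2 is in both sets
print(is_total(person, student, employee))  # p4 is in neither set
```

With this data the generalization is overlapping and partial; removing p2 from one of the lower-level sets would make it disjoint.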
Aggregation
n Consider the ternary relationship proj_guide, which we saw earlier
n Suppose we want to record evaluations of a student by a guide on a
project
Aggregation (Cont.)
n Relationship sets eval_for and proj_guide represent overlapping
information
l Every eval_for relationship corresponds to a proj_guide
relationship
l However, some proj_guide relationships may not correspond to
any eval_for relationships
4 So we can’t discard the proj_guide relationship
n Eliminate this redundancy via aggregation
l Treat relationship as an abstract entity
l Allows relationships between relationships
l Abstraction of relationship into new entity
Aggregation (Cont.)
n Without introducing redundancy, the following diagram represents:
l A student is guided by a particular instructor on a particular project
l A student, instructor, project combination may have an associated
evaluation
Representing Specialization via Schemas
n Method 1:
l Form a schema for the higher-level entity
l Form a schema for each lower-level entity set, include primary key
of higher-level entity set and local attributes

schema      attributes
person      ID, name, street, city
student     ID, tot_cred
employee    ID, salary

l Drawback: getting information about an employee requires
accessing two relations, the one corresponding to the low-level
schema and the one corresponding to the high-level schema
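The drawback of Method 1 shows up as soon as full employee information is queried. Below is a minimal sketch using Python's built-in sqlite3 module with an in-memory database; the column types and the sample row are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person   (ID TEXT PRIMARY KEY, name TEXT, street TEXT, city TEXT);
    CREATE TABLE employee (ID TEXT PRIMARY KEY REFERENCES person(ID), salary INTEGER);
    INSERT INTO person   VALUES ('p1', 'Ann', 'Main St', 'Springfield');
    INSERT INTO employee VALUES ('p1', 50000);
""")

# Full employee information needs both relations, joined on the shared key.
row = conn.execute("""
    SELECT p.ID, p.name, p.street, p.city, e.salary
    FROM employee AS e JOIN person AS p ON p.ID = e.ID
""").fetchone()
print(row)
```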
Representing Specialization as Schemas (Cont.)
n Method 2:
l Form a schema for each entity set with all local and inherited attributes

schema      attributes
person      ID, name, street, city
student     ID, name, street, city, tot_cred
employee    ID, name, street, city, salary

l If specialization is total, the schema for the generalized entity set
(person) is not required to store information
4 Can be defined as a “view” relation containing union of
specialization relations
l Drawback: name, street and city may be stored redundantly for people
who are both students and employees
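For a total specialization, the person relation of Method 2 can be derived rather than stored. Here is a minimal sketch, again with sqlite3 and invented sample rows, in which person is a view over the union of the two specialization relations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student  (ID TEXT PRIMARY KEY, name TEXT, street TEXT, city TEXT,
                           tot_cred INTEGER);
    CREATE TABLE employee (ID TEXT PRIMARY KEY, name TEXT, street TEXT, city TEXT,
                           salary INTEGER);
    -- person is not stored; it is derived from the specialization relations.
    CREATE VIEW person AS
        SELECT ID, name, street, city FROM student
        UNION
        SELECT ID, name, street, city FROM employee;
    INSERT INTO student  VALUES ('p1', 'Ann', 'Main St', 'Springfield', 30);
    INSERT INTO employee VALUES ('p1', 'Ann', 'Main St', 'Springfield', 50000);
    INSERT INTO employee VALUES ('p2', 'Bob', 'Oak Ave', 'Shelbyville', 42000);
""")

people = conn.execute("SELECT ID, name FROM person ORDER BY ID").fetchall()
print(people)
```

Ann is both a student and an employee, so her name, street and city are stored twice. UNION removes the duplicate in the view, but only as long as both stored copies stay identical.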
Schemas Corresponding to Aggregation (Cont.)
n To represent aggregation, create a schema
containing
l primary key of the aggregated relationship,
l the primary key of the associated entity set
l any descriptive attributes
n The schema for the relationship set eval_for between
the aggregation of proj_guide and the entity set
evaluation includes:
1. An attribute for each attribute in the primary keys
of the entity set evaluation, and the relationship
set proj_guide.
2. It also includes an attribute for any descriptive
attributes, if they exist, of the relationship set
eval_for.
n We then transform the relationship sets and entity
sets within the aggregated entity set following the
rules we have already defined.
Summary of E-R Design Decisions
n The use of an attribute or entity set to represent an object.
n Whether a real-world concept is best expressed by an entity set or
a relationship set.
n The use of a ternary relationship versus a pair of binary
relationships.
n The use of a strong or weak entity set.
n The use of specialization/generalization – contributes to modularity
in the design.
n The use of aggregation – can treat the aggregate entity set as a
single unit without concern for the details of its internal structure.
Relational Database Design
Normal Forms
n Normal forms defined in relational database theory represent
guidelines for record design.
n Presentation conveys an intuitive sense of the intended constraints on
record design
l Because of its informality, it may be imprecise in some technical details
n Normalization rules:
l are designed to prevent update anomalies and data
inconsistencies
l tend to penalize retrieval efficiency, since data which may have
been retrievable from one record in an unnormalized design may
have to be retrieved from several records in the normalized form
n No obligation to fully normalize all records when actual performance
requirements are taken into account
Presentation follows the article:
William Kent, "A Simple Guide to Five Normal Forms in Relational Database Theory", Communications of the ACM
1-NF
n First normal form deals with the "shape" of a record type
n Under first normal form, all occurrences of a record type must
contain the same number of fields
n First normal form excludes variable repeating fields and groups

Relation not in 1-NF:
EmpID    PrjID
E1       P1, P2
E2       P2, P3, P4
E3       P3

Relation in 1-NF:
EmpID    PrjID
E1       P1
E1       P2
E2       P2
E2       P3
E2       P4
E3       P3
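Going from a repeating group to one record per pair can be sketched in a few lines of Python. The data mirrors the EmpID/PrjID example; the dictionary representation is an assumption made for illustration.

```python
# Record type with a repeating group: each employee carries a
# variable-length list of project IDs (not in 1-NF).
unnormalized = {
    "E1": ["P1", "P2"],
    "E2": ["P2", "P3", "P4"],
    "E3": ["P3"],
}

# One (EmpID, PrjID) record per pair: every record now has two fields.
first_normal_form = [
    (emp_id, prj_id)
    for emp_id, projects in unnormalized.items()
    for prj_id in projects
]

for record in first_normal_form:
    print(record)
```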
2-NF
n Under second (and third) normal form, a
non-key field must provide a fact:
l about the key,
l the whole key, and
l nothing but the key
n In addition, the record must satisfy 1-NF
2-NF
n 2-NF is violated when a non-key field is a fact
about a subset of a key (when the key is
composite)
n Consider the following inventory schema of an
online book store:
n Inventory(BookID, Warehouse, Quantity, Warehouse-Address)
n Inventory is not in 2-NF:
l Why? Key is composite (BookID, Warehouse) but
Warehouse-Address is a fact about Warehouse alone
2-NF
n Problems caused by violating 2-NF:
l The warehouse address is repeated in every
record that refers to a book stored in that
warehouse
l If the address of the warehouse changes, every
record referring to a book stored in that
warehouse must be updated
l Because of the redundancy, the data might
become inconsistent, with different records
showing different addresses for the same
warehouse.
l If at some point in time there are no books stored
in the warehouse, there may be no record in
which to keep the warehouse's address
2-NF
n To satisfy 2-NF, the schema:
Inventory(BookID, Warehouse, Quantity, Warehouse-Address)
should be decomposed into (replaced by) the
two records:
l Stocking(BookID, Warehouse, Quantity) and
l Warehouse(Warehouse, Warehouse-Address)
n When replacing unnormalized schemas with
normalized schemas, the process is referred
to as normalization (in this case: 2-NF
normalization)
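The 2-NF decomposition of Inventory can be illustrated with plain Python collections. This is only a sketch: the book IDs, warehouse names and addresses are made up, and it assumes the input rows are consistent (one address per warehouse).

```python
# Rows of Inventory(BookID, Warehouse, Quantity, Warehouse-Address);
# the values are made up for illustration.
inventory = [
    ("B1", "W1", 10, "1 Dock Rd"),
    ("B2", "W1", 5, "1 Dock Rd"),
    ("B1", "W2", 7, "9 Pier Ave"),
]

# Stocking keeps the facts about the whole key (BookID, Warehouse).
stocking = [(book, wh, qty) for book, wh, qty, _ in inventory]

# Warehouse keeps the fact about Warehouse alone, stored once per warehouse.
warehouse = {wh: addr for _, wh, _, addr in inventory}

print(stocking)
print(warehouse)
```

The address of W1 appeared in two Inventory rows but is held exactly once after the decomposition, so a change of address is a single update.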
2-NF
n Normalized design enhances the
integrity of the data, by minimizing
redundancy and inconsistency
n But at a performance cost for retrieval
l Assume we want the addresses of all
warehouses stocking a certain book:
4 In the unnormalized form we search one
table
4 With the normalized design we have to join two
tables and search the appropriate pairs
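The retrieval cost of the normalized design is the join. A minimal sketch with sqlite3 follows; the column name WarehouseAddress (without the hyphen) and all sample rows are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Stocking  (BookID TEXT, Warehouse TEXT, Quantity INTEGER,
                            PRIMARY KEY (BookID, Warehouse));
    CREATE TABLE Warehouse (Warehouse TEXT PRIMARY KEY, WarehouseAddress TEXT);
    INSERT INTO Stocking  VALUES ('B1', 'W1', 10), ('B2', 'W1', 5), ('B1', 'W2', 7);
    INSERT INTO Warehouse VALUES ('W1', '1 Dock Rd'), ('W2', '9 Pier Ave');
""")

# Addresses of all warehouses stocking book B1: requires a join.
addresses = conn.execute("""
    SELECT w.WarehouseAddress
    FROM Stocking AS s JOIN Warehouse AS w ON w.Warehouse = s.Warehouse
    WHERE s.BookID = ?
    ORDER BY w.WarehouseAddress
""", ("B1",)).fetchall()
print(addresses)
```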
3-NF
n 3-NF is violated when a non-key field is
a fact about another non-key field
n Consider the schema
Works(EmpID, DepartmentID, Location)
l EmpID is the primary key
l Each department is located in one place
l The Location field is (in addition to being a fact about EmpID)
also a fact about DepartmentID, which
is not the key
3-NF
n Problems caused by violating 3-NF:
l The department's location is repeated in the
record of every employee assigned to that
department
l If the location of the department changes,
every such record must be updated
l Because of the redundancy, the data might
become inconsistent, with different records
showing different locations for the same
department
l If a department has no employees, there may
be no record in which to keep the
department's location
3-NF
n To satisfy 3-NF the schema
Works(EmpID, DepartmentID, Location)
should be decomposed into the two records:
Works(EmpID, DepartmentID)
Department(DepartmentID, Location)
n The two schemas are in 2-NF and 3-NF,
because every field either:
l is part of the key or
l provides a (single-valued) fact about exactly the
whole key and nothing else
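A short Python sketch of the 3-NF decomposition shows why the update anomaly disappears. The employee IDs, department IDs and city names are invented, and the helper `location_of` is hypothetical.

```python
# Rows of the unnormalized Works(EmpID, DepartmentID, Location);
# the values are made up for illustration.
works_unnormalized = [
    ("E1", "D1", "Berlin"),
    ("E2", "D1", "Berlin"),
    ("E3", "D2", "Munich"),
]

# Works(EmpID, DepartmentID): a fact about the key EmpID.
works = {emp: dept for emp, dept, _ in works_unnormalized}

# Department(DepartmentID, Location): a fact about the key DepartmentID.
department = {dept: loc for _, dept, loc in works_unnormalized}

# Relocating a department is now a single update instead of one per employee.
department["D1"] = "Hamburg"


def location_of(emp_id):
    """Look up an employee's location through the department."""
    return department[works[emp_id]]


print(location_of("E1"), location_of("E2"), location_of("E3"))
```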
Functional Dependencies
n In relational database theory, 2-NF and 3-NF
are defined in terms of functional dependencies
n A field Y is "functionally dependent" on a field
(or fields) X, if it is invalid to have two records
with the same X-value but different Y-values
l a given X-value must always occur with the same Y-value
n When X is a key, then all fields are by definition
functionally dependent on X in a trivial way
n 2-NF and 3-NF rule out functional dependencies of non-key fields
in all other (non-trivial) cases, i.e., where X is not a key
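The definition of functional dependency translates directly into a check over a set of records. The sketch below is illustrative only: the function name `holds` and the sample rows are assumptions, and it tests a single relation instance, whereas a functional dependency is a constraint on every valid instance.

```python
def holds(records, x_fields, y_field):
    """Check whether y_field is functionally dependent on x_fields in records.

    records is a list of dicts; the dependency fails if two records agree
    on every field in x_fields but differ on y_field.
    """
    seen = {}
    for record in records:
        x_value = tuple(record[f] for f in x_fields)
        y_value = record[y_field]
        if seen.setdefault(x_value, y_value) != y_value:
            return False
    return True


# Hypothetical rows of Works(EmpID, DepartmentID, Location).
works = [
    {"EmpID": "E1", "DepartmentID": "D1", "Location": "Berlin"},
    {"EmpID": "E2", "DepartmentID": "D1", "Location": "Berlin"},
    {"EmpID": "E3", "DepartmentID": "D2", "Location": "Munich"},
]

print(holds(works, ["DepartmentID"], "Location"))  # same department, same location
print(holds(works, ["Location"], "EmpID"))         # Berlin occurs with E1 and E2
```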
Functional Dependencies
n Functional dependencies only exist when the things
involved have unique and singular identifiers
n Example:
l Suppose a person has only one address
l If we don't provide unique identifiers for people, then there
will not be a functional dependency:

Person        Address
John Smith    123 Main St., New York
John Smith    321 Center St., San Francisco

l Although each person has a single address, a given name
can appear with several different addresses (different
persons with same name)
l A non-unique identifier precludes a functional dependency
Functional Dependencies
n Another example:
l the address has to be spelled identically (i.e., be
unique as identifier)

Person        Address
John Smith    123 Main St., New York
John Smith    123 Main Street, NYC

l The same person appears to be living at two
different addresses
l A non-unique identifier precludes a functional
dependency
Functional Dependencies
n Therefore, even when we assume that
Employee is uniquely identified by name
(reasonable for small firms), the instance of
relation:

Employee     Father        Father’s Address
Art Smith    John Smith    123 Main St., New York
Bob Smith    John Smith    123 Main Street, NYC
Cal Smith    John Smith    321 Center St., San Francisco

does not violate 3-NF
l Father’s name cannot be assumed to be a unique identifier
l Father’s address is not a unique identifier due to
inconsistent spellings
4-NF and 5-NF
n 4-NF and 5-NF deal with multi-valued facts, which:
l may correspond to a many-to-many
relationship
4 E.g., employees and skills (an employee may
have many skills)
l or to a many-to-one relationship
4 E.g., the children of an employee (assuming
only one parent is an employee)
4-NF
n Under 4-NF, a schema should not
contain two or more independent
multi-valued facts about an entity
n In addition, the schema must satisfy 3-NF
l The term "independent" will be defined in
the next slide
4-NF
n Example schema:
l Employees, skills, and languages, where an
employee may have several skills and speak
several languages
ESL(Emp, Skill, Lang)
n ESL violates 4-NF
n Why? Skill and Lang are independent
l A skill of an employee does not depend
in any way (no direct connection) on any language
l There is only an indirect connection because they belong to
some common employee
4-NF
n Problem caused by violating 4-NF: leads to
uncertainties in the relational representation

Disjoint format:
Emp      Skill    Language
Smith    cook
Smith    speak
Smith             French
Smith             German
Smith             Spanish

Minimal number of records with repetitions:
Emp      Skill    Language
Smith    cook     French
Smith    speak    German
Smith    speak    Spanish

Minimal number of records with null values:
Emp      Skill    Language
Smith    cook     French
Smith    speak    German
Smith    null     Spanish

A "cross-product" form:
Emp      Skill    Language
Smith    cook     French
Smith    cook     German
Smith    cook     Spanish
Smith    speak    French
Smith    speak    German
Smith    speak    Spanish
4-NF
n Other problems caused by violating 4-NF:
l If there are repetitions, then updates have to be done
in multiple records, and they could become
inconsistent
l Insertion of a new skill may involve looking for a
record with a blank skill, or inserting a new record with
a possibly blank language, or inserting multiple
records pairing the new skill with some or all of the
languages
l Deletion of a skill may involve blanking out the skill
field in one or more records (perhaps with a check
that this doesn't leave two records with the same
language and a blank skill), or deleting one or more
records, coupled with a check that the last mention of
some language hasn't also been deleted
4-NF
n 4-NF minimizes such update problems
n Decompose
ESL(Emp, Skill, Lang)
into
ES(Emp, Skill) and EL(Emp, Lang)
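When Skill and Lang are independent, nothing is lost by the decomposition: joining ES and EL on Emp yields the cross-product form again. The following is a minimal sketch using Python sets of tuples, with the values from the Smith example.

```python
# ESL(Emp, Skill, Lang) in "cross-product" form for one employee.
esl = {
    (emp, skill, lang)
    for emp in ["Smith"]
    for skill in ["cook", "speak"]
    for lang in ["French", "German", "Spanish"]
}

# Decompose into ES(Emp, Skill) and EL(Emp, Lang).
es = {(emp, skill) for emp, skill, _ in esl}
el = {(emp, lang) for emp, _, lang in esl}

# Joining ES and EL on Emp gives back the cross-product form.
rejoined = {
    (emp, skill, lang)
    for emp, skill in es
    for emp2, lang in el
    if emp == emp2
}

print(len(esl), len(es), len(el))
print(rejoined == esl)
```

Six three-field records shrink to two plus three two-field records, and each skill and each language is recorded exactly once per employee.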
4-NF
n What about ternary
relationships? Does 4-NF
mean that we always have to
decompose into 2-way
relationships?
n No!
l A ternary relationship does not
violate 4-NF
l Skill and Language are not
independent
n In a ternary relationship the
facts are not independent
n Assume there is a direct
connection between skill and
language
l Skill is performed in a specific
language
4 E.g., cook French cuisine

Emp      Skill    Language
Smith    cook     French
Smith    speak    German
Smith    speak    Spanish
5-NF
n 5-NF deals with cases where
information can be reconstructed from
smaller pieces of information that can
be maintained with less redundancy
n 2-NF, 3-NF, and 4-NF also serve this
purpose, but 5-NF generalizes to cases
not covered by the others
n No comprehensive exposition, but
illustrate central concept with example
5-NF
n Example:
l agents represent companies
l companies make products
l agents sell products
l record which agent sells which product for which company

Agent    Comp    Product
Smith    Ford    Car
Smith    GM      Truck

l Notice that:
4 Smith does not sell Ford trucks or GM cars
4 need the combination of three fields to know which combinations
are valid
5-NF
n Assume the rule:
l if an agent sells a certain product type,
l and he represents a company making that product type,
l then he sells the products of this type made by this company
n Example facts:
l Ford and GM produce cars and trucks
l Smith sells cars and trucks,
Jones sells only cars
l Smith represents Ford and GM,
Jones represents Ford

Agent    Comp    Product
Smith    Ford    Car
Smith    Ford    Truck
Smith    GM      Car
Smith    GM      Truck
Jones    Ford    Car
5-NF
n But we can reconstruct all the true facts from a normalized form
consisting of three separate schemas, each containing two fields:

Smith represents Ford and GM, Jones represents Ford:
Agent    Comp
Smith    Ford
Smith    GM
Jones    Ford

Ford and GM produce cars and trucks:
Comp     Product
Ford     Car
Ford     Truck
GM       Car
GM       Truck

Smith sells cars and trucks, Jones sells only cars:
Agent    Product
Smith    Car
Smith    Truck
Jones    Car

n These three schemas are in 5-NF, whereas the corresponding three-field
schema (previous slide) is not
n A schema is in 5-NF when its information content cannot be
reconstructed from schemas each having fewer fields (exclude the
case where all smaller schemas have the same key)
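The reconstruction from the three two-field schemas is a three-way join. Below is a minimal sketch with Python sets, using the Smith/Jones facts from the example; the variable names are illustrative.

```python
represents = {("Smith", "Ford"), ("Smith", "GM"), ("Jones", "Ford")}  # (Agent, Comp)
makes = {("Ford", "Car"), ("Ford", "Truck"),
         ("GM", "Car"), ("GM", "Truck")}                              # (Comp, Product)
sells = {("Smith", "Car"), ("Smith", "Truck"), ("Jones", "Car")}      # (Agent, Product)

# Apply the rule: an agent sells a company's product if he represents the
# company, the company makes the product, and he sells that product type.
reconstructed = {
    (agent, comp, product)
    for agent, comp in represents
    for comp2, product in makes
    if comp == comp2 and (agent, product) in sells
}

for row in sorted(reconstructed):
    print(row)
```

The join produces the five (Agent, Comp, Product) records of the three-field schema. This works only because the symmetric rule holds; without it the join could produce combinations that are not true facts.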
5-NF
n Notice: 5-NF does not differ from 4-NF unless there
exists a symmetric constraint (such as the rule about
agents, companies, and products)
l when there is no such constraint, a schema in 4-NF is always in
5-NF also
n Advantage of 5-NF:
l certain redundancies can be eliminated
l the fact that Smith sells cars is recorded only once;
l in the unnormalized form it may be repeated many times

Normalized form:
Agent    Product
Smith    Car
Smith    Truck
Jones    Car

Unnormalized form:
Agent    Comp    Product
Smith    Ford    Car
Smith    Ford    Truck
Smith    GM      Car
Smith    GM      Truck
Jones    Ford    Car
Exercise
n The denormalized table
l stores data for products purchased by people online
l This database also stores their employer information
4 assume that a person can only have one employer

SSN          User Name    Product1    Product2    More Products    Employer Name    Employer Address
332345432    Amy          M                                        Google           1 California drive
666666666    Kevin        A           B           C,D              Facebook         22nd Street Sanfrancisco
919919919    Raj          D                                        Google           1 California drive

Database Normalization Tutorial with example
http://dotnetanalysis.blogspot.de/2012/01/database-normalization-sql-server.html
Exercise: 1-NF
n Only one value in a column
n No multiple columns for a one-to-many
relationship
n Which problems do you see in the previous table?

SSN          User Name    Employer Name    Employer Address            Product
332345432    Amy          Google           1 California drive          M
666666666    Kevin        Facebook         22nd Street Sanfrancisco    A
666666666    Kevin        Facebook         22nd Street Sanfrancisco    B
666666666    Kevin        Facebook         22nd Street Sanfrancisco    C
666666666    Kevin        Facebook         22nd Street Sanfrancisco    D
919919919    Raj          Google           1 California drive          D

SSN and Product together have been chosen as the primary key
Exercise: 2-NF
n All the non-primary-key columns in the table should depend on the
entire primary key
n Which problems do you see in the previous table?
l The UserName column does not depend on the entire primary key. It
only depends on a part of the primary key (SSN)
l EmployerName and EmployerAddress do not depend on the entire
primary key. They only depend on a part of the primary key (SSN)

SSN          User Name
332345432    Amy
666666666    Kevin
919919919    Raj

SSN          Employer Name    Employer Address
332345432    Google           1 California drive
666666666    Facebook         22nd Street Sanfrancisco
919919919    Google           1 California drive

SSN          Product
332345432    M
666666666    A
666666666    B
666666666    C
666666666    D
919919919    D

In 2-NF every column is dependent on the entire primary key in that table and
not just on part of the primary key
Exercise: 3-NF
n No indirect dependency between non-key fields
n Which problems do you see in the previous tables?
l EmployerAddress depends on EmployerName

SSN          User Name
332345432    Amy
666666666    Kevin
919919919    Raj

SSN          Employer Name
332345432    Google
666666666    Facebook
919919919    Google

Employer Name    Employer Address
Google           1 California drive
Facebook         22nd Street Sanfrancisco

SSN          Product
332345432    M
666666666    A
666666666    B
666666666    C
666666666    D
919919919    D
Summary of Design Process
n An initial set of data elements and records has to be developed,
as candidates for normalization
n Then the factors affecting normalization have to be assessed:
l Single-valued vs. multi-valued facts
l Dependency on the entire key
l Independent vs. dependent facts
l The presence of mutual constraints
l The presence of non-unique or non-singular representations
n And, finally, the desirability of normalization has to be
assessed, in terms of its performance impact on retrieval
applications