Links - NETLAB - University of Kentucky

advertisement
CS 405G: Introduction to
Database Systems
Instructor: Jinze Liu
Fall 2009
Review

A data model is


What are the two terms used by ER model to
describe a miniworld?



a group of concepts for describing data.
Entity
Relationship
Student (sid: string, name:
string, login: string, age: integer,
gpa:real)
What makes a good database design
4/13/2015
10101
11101
2
Topics Next


More case study
Conversion of ER models to Schemas
4/13/2015
3
ER Case study
Design a database representing cities, counties, and states




For states, record name and capital (city)
For counties, record name, area, and location (state)
For cities, record name, population, and location (county and
state)
Assume the following:






4
4/13/2015
Names of states are unique
Names of counties are only unique within a state
Names of cities are only unique within a county
A city is always located in a single county
A county is always located in a single state
4/13/2015
4
Case study : first design

County area information is repeated for every city in
the county



Redundancy is bad.
What else?
State capital should really be a city

Should “reference” entities through explicit relationships
name
name
Cities
population
In
States
capital
county_name
county_area
4/13/2015
5
Case study : second design

Technically, nothing in this design could prevent a city in
state X from being the capital of another state Y, but oh
well…
name
Cities
population
In
IsCapitalOf
name
Counties
In
States
name
area
4/13/2015
6
Database Design
4/13/2015
7
A Relation is a Table
Attributes
(column
headers)
Tuples
(rows)
8
name
Winterbrew
Bud Lite
manf
Pete’s
Anheuser-Busch Beers
Schemas
Relation schema = relation name + attributes, in order
(+ types of attributes).


Example: Beers(name, manf) or Beers(name: string, manf:
string)
Database = collection of relations.
Database schema = set of all relation schemas in the
database.


9
Why Relations?



Very simple model.
Often matches how we think about data.
Abstract model that underlies SQL, the most important
database language today.

10
But SQL uses bags, while the relational model is a set-based
model.
From E/R Diagrams to Relations


Entity sets become relations with the same set of
attributes.
Relationships become relations whose attributes are only:


11
The keys of the connected entity sets.
Attributes of the relationship itself.
Entity Set -> Relation
name
manf
Beers
Relation: Beers(name, manf)
12
Relationship -> Relation
name
husband
addr
Drinkers
1
name
Likes
manf
Beers
2
Buddies
Favorite
wife
Married
Likes(drinker, beer)
Favorite(drinker, beer)
Buddies(name1, name2)
Married(husband, wife)
13
Combining Relations


It is OK to combine the relation for an entity-set E with
the relation R for a many-one relationship from E to
another entity set.
Example: Drinkers(name, addr) and Favorite(drinker, beer)
combine to make Drinker1(name, addr, favBeer).
14
Risk with Many-Many Relationships

Combining Drinkers with Likes would be a mistake. It
leads to redundancy, as:
name
addr
beer
Sally 123 Maple Bud
Sally 123 Maple Miller
Redundancy
15
Handling Weak Entity Sets


Relation for a weak entity set must include attributes for
its complete key (including those belonging to other
entity sets), as well as its own, nonkey attributes.
A supporting (double-diamond) relationship is redundant
and yields no relation.
16
Example
name
time
17
Logins
name
At
Hosts
Example
name
time
Logins
name
At
Hosts
Hosts(hostName)
Logins(loginName, hostName, time)
At(loginName, hostName, hostName2)
At becomes part of
Logins
18
Must be the same
A (Slightly) Formal Definition



A database is a collection of relations (or tables)
Each relation is identified by a name and a list of
attributes (or columns)
Each attribute has a name and a domain (or type)

Set-valued attributes not allowed
19
Schema versus instance

Schema (metadata)




Specification of how data is to be structured logically
Defined at set-up
Rarely changes
Instance


Content
Changes rapidly, but always conforms to the schema
 Compare
to type and objects of type in a programming
language
4/13/2015
20
Example

Schema




Student (SID integer, name string, age integer, GPA float)
Course (CID string, title string)
Enroll (SID integer, CID integer)
Instance



4/13/2015
{ h142, Bart, 10, 2.3i, h123, Milhouse, 10, 3.1i, ...}
{ hCPS116, Intro. to Database Systemsi, ...}
{ h142, CPS116i, h142, CPS114i, ...}
21
Relational Integrity Constraints
Constraints are conditions that must hold on all valid
relation instances. There are four main types of
constraints:

1.
Domain constraints
1.
2.
3.
4.
4/13/2015
The value of an attribute must come from its domain
Key constraints
Entity integrity constraints
Referential integrity constraints
22
Primary Key Constraints

A set of fields is a candidate key for a relation if :
1. No two distinct tuples can have same values in all key fields,
and
2. This is not true for any subset of the key.
 Part 2 false? A superkey.
 If there’s >1 key for a relation, one of the keys is chosen (by
DBA) to be the primary key.

E.g., given a schema Student(sid: string, name: string, gpa:
float) we have:

4/13/2015
sid is a key for Students. (What about name?) The set {sid,
gpa} is a superkey.
Jinze Liu @ University of Kentucky
23
Key Example

CAR (licence_num: string, Engine_serial_num: string,
make: string, model: string, year: integer)



4/13/2015
What is the candidate key(s)
Which one you may use as a primary key
What are the super keys
24
Entity Integrity

Entity Integrity: The primary key attributes PK of each
relation schema R in S cannot have null values in any tuple
of r(R).

4/13/2015
Other attributes of R may be similarly constrained to
disallow null values, even though they are not members of the
primary key.
25
Foreign Keys, Referential Integrity


Foreign key : Set of fields in one relation that is used to
`refer’ to a tuple in another relation. (Must correspond
to primary key of the second relation.) Like a `logical
pointer’.
E.g. sid is a foreign key referring to Students:




Student(sid: string, name: string, gpa: float)
Enrolled(sid: string, cid: string, grade: string)
If all foreign key constraints are enforced, referential
integrity is achieved, i.e., no dangling references.
Can you name a data model w/o referential integrity?

4/13/2015
Links in HTML!
Jinze Liu @ University of Kentucky
26
Foreign Keys

Only students listed in the Students relation should be
allowed to enroll for courses.
Enrolled
sid
53666
53666
53650
53666

cid
Carnatic101
Reggae203
Topology112
History105
grade
C
B
A
B
Students
sid
53666
53688
53650
name
login
Jones jones@cs
Smith smith@eecs
Smith smith@math
age
18
18
19
gpa
3.4
3.2
3.8
Or, use NULL as the value for the foreign key in the
referencing tuple when the referenced tuple does not
exist
4/13/2015
Jinze Liu @ University of Kentucky
27
In-Class Exercise
(Taken from Exercise 5.16)
Consider the following relations for a database that keeps track of student
enrollment in courses and the books adopted for each course:
STUDENT(SSN, Name, Major, Bdate)
COURSE(Course#, Cname, Dept)
ENROLL(SSN, Course#, Quarter, Grade)
BOOK_ADOPTION(Course#, Quarter, Book_ISBN)
TEXT(Book_ISBN, Book_Title, Publisher, Author)
Draw a relational schema diagram specifying the foreign keys for this schema.
4/13/2015
28
In-Class Exercise
(Taken from Exercise 5.16)
Consider the following relations for a database that keeps track of
student enrollment in courses and the books adopted for each
course:
STUDENT(SSN, Name, Major, Bdate)
COURSE(Course#, Cname, Dept)
ENROLL(SSN, Course#, Quarter, Grade)
BOOK_ADOPTION(Course#, Quarter, Book_ISBN)
TEXT(Book_ISBN, Book_Title, Publisher, Author)
Draw a relational schema diagram specifying the foreign keys
for
29 this schema.
Jinze Liu @ University of Kentucky 4/13/2015
Other Types of Constraints

Semantic Integrity Constraints:




4/13/2015
based on application semantics and cannot be expressed by
the model per se
e.g., “the max. no. of hours per employee for all projects he
or she works on is 56 hrs per week”
A constraint specification language may have to be used to
express these
SQL-99 allows triggers and ASSERTIONS to allow for some
of these
30
Update Operations on Relations

Update operations




31
INSERT a tuple.
DELETE a tuple.
MODIFY a tuple.
Constraints should not be violated in updates
Jinze Liu @ University of Kentucky
4/13/2015
Example

We have the following relational schemas





Student(sid: string, name: string, gpa: float)
Course(cid: string, department: string)
Enrolled(sid: string, cid: string, grade: character)
We have the following sequence of database update
operations. (assume all tables are empty before we apply
any operations)
INSERT<‘1234’, ‘John Smith’, ‘3.5> into Student
sid
1234
32
name
John Smith
gpa
3.5
Jinze Liu @ University of Kentucky
4/13/2015
Example (Cont.)




INSERT<‘647’, ‘EECS’>
into Courses
INSERT<‘1234’, ‘647’,
‘B’> into Enrolled
UPDATE the grade in
the Enrolled tuple with
sid = 1234 and cid =
647 to ‘A’.
DELETE the Enrolled
tuple with sid 1234 and
cid 647
33
sid
1234
name
John Smith
cid
647
department
EECS
sid
cid
grade
1234
647
A
B
Jinze Liu @ University of Kentucky
4/13/2015
gpa
3.5
Exercise



INSERT<‘108’,
‘MATH’> into Courses
INSERT<‘1234’, ‘108’,
‘B’> into Enrolled
INSERT<‘1123’, ‘Mary
Carter’, ‘3.8’> into
Student
34
sid
1234
name
John Smith
gpa
3.5
1123
Mary Carter
3.8
cid
647
108
department
EECS
MATH
sid
1234
cid
108
Jinze Liu @ University of Kentucky
grade
B
4/13/2015
Exercise (cont.)


A little bit tricky
INSERT<‘1125’, ‘Bob
Lee’, ‘good’> into
Student


INSERT<‘1123’, NULL,
‘B’> into Enrolled


Fail due to domain
constraint
Fail due to entity
integrity
INSERT <‘1233’,’647’,
‘A’> into Enrolled

Failed due to
referential integrity
35
sid
1234
name
John Smith
gpa
3.5
1123
Mary Carter
3.8
cid
647
department
EECS
108
MATH
sid
1234
cid
108
Jinze Liu @ University of Kentucky
grade
B
4/13/2015
Exercise (cont.)


A more tricky one
UPDATE the cid in the
tuple from Course
where cid = 108 to
109
36
sid
1234
name
John Smith
gpa
3.5
1123
Mary Carter
3.8
cid
647
department
EECS
108
109
MATH
sid
1234
cid
108
109
Jinze Liu @ University of Kentucky
grade
B
4/13/2015
Update Operations on Relations

In case of integrity violation, several actions can be
taken:




37
Cancel the operation that causes the violation (REJECT
option)
Perform the operation but inform the user of the violation
Trigger additional updates so the violation is corrected
(CASCADE option, SET NULL option)
Execute a user-specified error-correction routine
Jinze Liu @ University of Kentucky
4/13/2015
Download