Lecture 8

Lecture 8: Schema Refinement and Normal Forms; Physical Design and Tuning
3/14/2016
• Schema Refinement
– Motivation
– Anomalies, Redundancy
– Decomposition: a good solution
– Keys and Functional Dependencies (FDs)
– BCNF and Redundancy
– Lossless Decompositions
– Dependency Preserving Decompositions, Projections
– Third Normal Form
• Physical Design
– Performance and the workload
– Choosing Indexes
• Identifying useful indexes, Too many indexes, How indexes are chosen
– More Schema Refinement
• Denormalization, Vertical and Horizontal Decomposition
• Tuning the database and tuning queries
Slide 1
Learning Objectives
LO8.1: Identify update, insertion and deletion anomalies
LO8.2: Identify possible keys given an instance
LO8.3: Identify possible functional dependencies in a
relation
LO8.4: Determine all keys in a schema
LO8.5: Decompose a schema into BCNF schemas
Slide 2
Review
• We began the course with the life cycle of database
applications
• First came Requirements Analysis from the customer
• We learned how to transform an RA into an ER diagram
• Then we transformed ER diagrams into relational schemas
– and went on to implement the application by loading the data
and writing SQL statements
• But different ER diagrams can lead to different relational
schemas. This week we study which schemas are best.
Slide 3
What is Schema Refinement?
• Schema Refinement is the study of what should go where in a
DBMS, or, which schemas are best to describe an application.
• For example, consider this schema
EmpDept
EID   Name    DeptID   DeptName
A01   Ali     12       Wing
A12   Eric    10       Tail
A13   Eric    12       Wing
A03   Tyler   12       Wing
• Versus this one:
Emp
EID   Name    DeptID
A01   Ali     12
A12   Eric    10
A13   Eric    12
A03   Tyler   12
Dept
DeptID   DeptName
12       Wing
10       Tail
• Which schema do you think is best? Why?
Slide 4
What’s wrong?*
• The first problem students usually identify with the
EmpDept schema is that it combines two different
ideas: employee information and department
information. But what is wrong with this?
1. If we separated the two concepts we could save space.
2. Combining the two ideas leads to some bad anomalies.
• These two problems occur because DeptID determines
DeptName, but DeptID is not a key. Let’s look into the
anomalies further.
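As a rough illustration of two of these anomalies, the sketch below loads the EmpDept instance into an in-memory SQLite database (SQLite, the column types, and the replacement name 'Fuselage' are assumptions made only for this example). The insertion anomaly, where a department with no employees cannot be recorded without inventing an EID, is not executed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EmpDept (EID TEXT PRIMARY KEY, Name TEXT, "
    "DeptID INTEGER, DeptName TEXT)"
)
conn.executemany(
    "INSERT INTO EmpDept VALUES (?, ?, ?, ?)",
    [("A01", "Ali", 12, "Wing"), ("A12", "Eric", 10, "Tail"),
     ("A13", "Eric", 12, "Wing"), ("A03", "Tyler", 12, "Wing")],
)

# Update anomaly: renaming department 12 in only one row leaves
# the table with two different names for the same DeptID.
# ('Fuselage' is a hypothetical value.)
conn.execute("UPDATE EmpDept SET DeptName = 'Fuselage' WHERE EID = 'A01'")
names_for_12 = {row[0] for row in conn.execute(
    "SELECT DeptName FROM EmpDept WHERE DeptID = 12")}

# Deletion anomaly: deleting the only employee of department 10
# also deletes the fact that department 10 is called 'Tail'.
conn.execute("DELETE FROM EmpDept WHERE EID = 'A12'")
dept_10_rows = conn.execute(
    "SELECT COUNT(*) FROM EmpDept WHERE DeptID = 10").fetchone()[0]
```

Both problems disappear once DeptName is stored once per DeptID in a separate table.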
Slide 5
Anomalies, Redundancy*
EmpDept
EID   Name    DeptID   DeptName
A01   Ali     12       Wing
A12   Eric    10       Tail
A13   Eric    12       Wing
A03   Tyler   12       Wing
• What anomalies are associated with EmpDept?
• Update Anomalies:
• Insertion Anomalies:
• Deletion Anomalies:
Slide 6
LO8.1: Practice Anomalies, Redundancies*
• Identify anomalies associated with this schema.
Include update, insertion and deletion anomalies.
EnrollStud(StudID, ClassID, Grade, ProfID, StudName)
• Why do these anomalies occur?
Slide 7
Decomposition: A good solution
• The intergalactic standard solution to the redundancy
problem is to decompose redundant schemas, e.g.,
EmpDept becomes
Emp
EID   Name    DeptID
A01   Ali     12
A12   Eric    10
A13   Eric    12
A03   Tyler   12
Dept
DeptID   DeptName
12       Wing
10       Tail
• The secret to understanding when and how to
decompose schemas is Functional Dependencies,
a generalization of keys.
• When we say "X determines Y" we are stating a
functional dependency.
Slide 8
Review Keys
EmpDept
EID   Name    DeptID   DeptName
A01   Ali     12       Wing
A12   Eric    10       Tail
A13   Eric    12       Wing
A03   Tyler   12       Wing
• Note that EID being a key* of EmpDept means
that the values of EID are unique, and EID is
minimal.
• Remember: you cannot determine keys from an
instance, only from “natural” information or from a
domain expert.
• Let’s practice keys by identifying possible keys in
an instance.
*sometimes called a candidate key
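A minimal sketch of what "possible keys in an instance" means: enumerate the minimal column sets whose values are unique across the rows. The function name possible_keys is hypothetical, and the brute-force search is only practical for small tables like the ones on these slides.

```python
from itertools import combinations

def possible_keys(columns, rows):
    """Return the minimal column sets whose values are unique in this instance."""
    keys = []
    for size in range(1, len(columns) + 1):
        for combo in combinations(range(len(columns)), size):
            # A superset of a smaller possible key is not minimal; skip it.
            if any(set(k) <= set(combo) for k in keys):
                continue
            projected = [tuple(row[i] for i in combo) for row in rows]
            if len(set(projected)) == len(rows):
                keys.append(combo)
    return [tuple(columns[i] for i in k) for k in keys]

columns = ["EID", "Name", "DeptID", "DeptName"]
emp_dept = [
    ("A01", "Ali", 12, "Wing"),
    ("A12", "Eric", 10, "Tail"),
    ("A13", "Eric", 12, "Wing"),
    ("A03", "Tyler", 12, "Wing"),
]
keys = possible_keys(columns, emp_dept)
```

On this instance the search also reports {Name, DeptID} and {Name, DeptName}, which are unique here only by accident. That is why an instance can suggest possible keys but cannot establish real ones.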
Slide 9
LO8.2: Identify Possible Keys*
• Identify all possible Keys based on this instance:
Time      Flight   Plane   Origin   Destination
9:57AM    157      abc     SEA      PDX
10:42AM   233      def     PDX      SEA
11:44AM   155      des     ORD      ATL
12:44PM   244      xdy     ATL      PDX
1:43PM    074      xyz     SEA      ATL
2:44PM    233      def     PDX      ATL
3:55PM    455      eff     MSP      SEA
5:44PM    120      ikk     MSP      PDX
7:55PM    233      abf     CHI      SEA
Slide 10
Functional Dependencies
EmpDept
EID   Name    DeptID   DeptName
A01   Ali     12       Wing
A12   Eric    10       Tail
A13   Eric    12       Wing
A03   Tyler   12       Wing
• A key like EID has another property: If two rows have
the same EID, then they have the same value of
every other attribute. We say EID functionally
determines all other attributes and write this
Functional Dependency (FD):
EID → Name, DeptID, DeptName
• Is Name → DeptID true?
– No, because rows 2 and 3 have the same Name but not the
same DeptID.
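The row-by-row check used above can be sketched as a small function (the name fd_holds is hypothetical). It can only refute an FD on a given instance; a result of True means the instance is consistent with the FD, not that the FD is part of the schema.

```python
def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on lhs while disagreeing on rhs."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for each lhs value.
        if seen.setdefault(left, right) != right:
            return False
    return True

emp_dept = [
    {"EID": "A01", "Name": "Ali", "DeptID": 12, "DeptName": "Wing"},
    {"EID": "A12", "Name": "Eric", "DeptID": 10, "DeptName": "Tail"},
    {"EID": "A13", "Name": "Eric", "DeptID": 12, "DeptName": "Wing"},
    {"EID": "A03", "Name": "Tyler", "DeptID": 12, "DeptName": "Wing"},
]

name_determines_dept = fd_holds(emp_dept, ["Name"], ["DeptID"])
eid_determines_rest = fd_holds(emp_dept, ["EID"], ["Name", "DeptID", "DeptName"])
```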
Slide 11
Functional Dependencies, ctd.
EmpDept
EID   Name    DeptID   DeptName
A01   Ali     12       Wing
A12   Eric    10       Tail
A13   Eric    12       Wing
A03   Tyler   12       Wing
• Do you see any more FDs in EmpDept?
– Yes, the FD DeptID → DeptName
• DEFINITION: If A and B are sets of attributes in a relation, we
say that A (functionally) determines B, or A → B is a
Functional Dependency (FD), if whenever two rows agree on
A, they agree on B. In other words, the value of a row on A
functionally determines its value on B.
• There are two special kinds of FDs:
– Key FDs, X → A where X contains a key
– Trivial FDs, such as Name → Name, or Name, DeptID → DeptID
Slide 12
Identify (natural) FDs*
• What are the (natural) FDs in these relations? Identify the key
FDs but ignore trivial FDs
Customer(CustID, Address, City, Zip, State)
EnrollStud(StudID, ClassID, Grade, ProfID, StudName, ProfName)
Slide 13
What are FDs?
• An FD is a generalization of the concept of key.
• FDs, like keys and foreign keys, are a kind of integrity
constraint (IC).
• Like other ICs, FDs are part of a relation’s schema.
• For example, a schema might be:
Assigned(EmpID Int,
JobID Int,
EmpName varchar(20),
percent real,
EmpID references… , JobID references…,
PRIMARY KEY (EmpID, JobID))
FDs: EmpID → EmpName
Slide 14
How to determine FDs
• So far we have dealt with “natural” FDs. Sometimes
it’s not clear what FDs apply in a relation, e.g., zip
codes vs cities, or
Supplier(Name, Address, Crating, Discount) – unclear
what are the FDs.
• There are two ways to determine FDs
– Infer them as “natural” FDs from your experience
– You may be given them as part of the schema, by the
instructor or by the customer.
• As with keys, you cannot determine FDs from an
instance!
– But you can tell if something is not an FD
Slide 15
LO8.3: Identify Possible FDs*
• Identify two possible non-key FDs based on this
instance (identical to slide 10). Remember the
possible keys for this instance are {Time}, {Plane,
Dest}, {Origin, Dest}
Time      Flight   Plane   Origin   Destination
9:57AM    157      abc     SEA      PDX
10:42AM   233      def     PDX      SEA
11:44AM   155      des     ORD      ATL
12:44PM   244      xdy     ATL      PDX
1:43PM    074      xyz     SEA      ATL
2:44PM    233      def     PDX      ATL
3:55PM    455      eff     MSP      SEA
5:44PM    120      ikk     MSP      PDX
7:55PM    233      abf     CHI      SEA
Slide 16
Reasoning about FDs
EmpDept(EID, Name, DeptID, DeptName)
• Two natural FDs are
EID → DeptID and DeptID → DeptName
• These two FDs imply the FD EID → DeptName
– Because if two tuples agree on EID, then by the first FD they
agree on DeptID, then by the second FD they agree on
DeptName.
• The set of FDs implied by a given set F of FDs is called
the closure of F and is denoted F+
Slide 17
Armstrong’s Axioms
• The closure of F can be computed using these axioms
– Reflexivity: If X ⊇ Y, then X → Y
– Augmentation: If X → Y, then XZ → YZ for any Z
– Transitivity: If X → Y and Y → Z, then X → Z
• Armstrong’s axioms are sound (they generate only FDs
in F+ when applied to FDs in F) and complete (repeated
application of these axioms will generate all FDs in F+).
Slide 18
Determining Keys
• To determine whether X is a key of a relation R, use
this algorithm, which computes the attribute closure
of X:
AttClos = X;   // Note: X is a set of attributes
Repeat until there is no change {
    If there is an FD U → V with U ⊆ AttClos, then set
    AttClos = AttClos ∪ V
}
AttClos = R if and only if X contains a key (X is a superkey).
X is a key if, in addition, no proper subset of X has closure R.
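The closure loop above translates almost line for line into code. This is a sketch under assumptions: FDs are represented as (lhs, rhs) pairs of attribute tuples, the function names are hypothetical, and EID → Name is added to the natural FDs of EmpDept for the example.

```python
def attribute_closure(attrs, fds):
    """Compute the attribute closure of attrs under the FDs in fds."""
    closure = set(attrs)
    changed = True
    while changed:                      # repeat until there is no change
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure

def is_superkey(attrs, relation, fds):
    return attribute_closure(attrs, fds) == set(relation)

def is_key(attrs, relation, fds):
    """A key is a superkey from which no single attribute can be removed."""
    attrs = set(attrs)
    return is_superkey(attrs, relation, fds) and not any(
        is_superkey(attrs - {a}, relation, fds) for a in attrs)

emp_dept = ["EID", "Name", "DeptID", "DeptName"]
fds = [(("EID",), ("Name", "DeptID")), (("DeptID",), ("DeptName",))]

eid_closure = attribute_closure(["EID"], fds)
dept_closure = attribute_closure(["DeptID"], fds)
```

Dropping one attribute at a time is enough for the minimality test, because any subset of a non-superkey is also a non-superkey.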
Slide 19
LO8.4: Determining the keys of R*
• Given the schema: R(A,B,C,D,E) with FDs BC → A, DE → C.
• What are all the keys of this schema?
• Hint: any key must include A, BC or DE. Why?
Slide 20
Redundancy and FDs
• Consider the FDs in these examples:
EmpDept(EID, Name, DeptID, DeptName)
Assigned(EmpID, JobID, EmpName, percent)
EnrollStud(StudID, ClassID, Grade)
• Remember that every non-key FD is associated with
some redundancy, or anomalies, and vice-versa.
• Our game plan is to use non-key FDs to decompose any
relation into a form that has no redundancy, a so-called
normal form.
Slide 21
Boyce-Codd Normal Form (BCNF)*
• A relation is said to be in Boyce-Codd Normal Form if
all its FDs are either trivial FDs or key FDs.
• Which of these relations is BCNF?
EmpDept(EID, Name, DeptName)
Assigned(EmpID, JobID, EmpName, percent)
EnrollStud(StudID, ClassID, grade)
• Each BCNF relation with a single key looks like this
Key | Nonkey Attr1 | Nonkey Attr2 | ... | Nonkey Attrk
Slide 22
BCNF and Redundancy
• Theorem: BCNF relations have no redundancy.
Proof: A relation has redundancy if there is an FD between two sets
of attributes, say DeptID → DeptName, and there can be
repeated entries of data for those attributes.
For example, consider (12, Wing) in this example:
DeptID   DeptName   (Other attributes)
12       Wing
10       Tail
12       Wing
But if the relation is BCNF, then the FD must be a key FD, and
DeptID must be a key. Thus any pair such as (12,Wing) can
appear only once.
Slide 23
Decomposition into BCNF
• Here is an algorithm for decomposing an arbitrary
relation R into a collection of BCNF relations:
1. If R is not in BCNF and X → A is a nontrivial non-key FD, then
decompose R into R − A and XA (the relation whose attributes
are those of X together with A).
2. If R − A and/or XA is not in BCNF, recursively apply
step 1.
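The two steps above can be sketched as a short recursive function. This is a variant, not the slide's exact procedure: instead of removing a single attribute A, it removes everything that a violating X determines inside the relation in one step. FDs are (lhs, rhs) pairs of sets, all function names are hypothetical, and the subset search is exponential, so it is only suitable for small schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def find_violation(relation, fds):
    """Return (X, Y) where X -> Y is a nontrivial non-key FD in relation, else None."""
    for size in range(1, len(relation)):
        for combo in combinations(sorted(relation), size):
            x = set(combo)
            determined = closure(x, fds) & relation
            y = determined - x
            # X determines something new but not the whole relation.
            if y and determined != relation:
                return x, y
    return None

def bcnf_decompose(relation, fds):
    relation = set(relation)
    violation = find_violation(relation, fds)
    if violation is None:
        return [relation]
    x, y = violation
    # Split into R - Y and XY, then recurse on both parts.
    return bcnf_decompose(relation - y, fds) + bcnf_decompose(x | y, fds)

fds = [({"EID"}, {"Name", "DeptID"}), ({"DeptID"}, {"DeptName"})]
parts = bcnf_decompose({"EID", "Name", "DeptID", "DeptName"}, fds)
```

For EmpDept the only violation is DeptID → DeptName, so the result is the familiar Emp and Dept pair.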
Slide 24
Decomposing to BCNF*
• Given the schema
EnrollStud(StudID, ClassID, Grade, ProfID, StudName)
including its natural FDs, decompose it into BCNF relations.
Slide 25
LO8.5: Decomposing into BCNF*
• Given the schema
MedsLabelDrug(Prescr#, CustID, Label, Drug)
with FDs
Prescr# → Label, Label → Drug
decompose it into BCNF relations.
Slide 26
Where are we?
• We’ve accomplished a lot!
– We began with a relational schema
– We identified (redundancy, anomaly) problems with it
– We learned how to use FDs to eliminate those problems with
decompositions into BCNF.
– Along the way, we learned a powerful tool: how to determine
keys from FDs.
• There are two steps left
– Showing that the BCNF decompositions do not lose
information.
– Discovering that they may lose FDs, and how to deal with
that.
Slide 27
Lossless Decompositions
• Some decompositions lose information. Suppose we got carried
away and further decomposed
Enroll(StudID,ClassID,Grade) into
StudGrade(StudID, Grade) and ClassGrade(ClassID, Grade)
• Here a row (123,B) in StudGrade means that student 123 got a B
in some course, and (386,A) in ClassGrade means that some
student got an A in course 386.
• But now we have no way of knowing which student got which
grade in which class.
• This decomposition is lossy. It contains less information than the
original schema. We want to generate only lossless
decompositions when we design our databases.
Slide 28
Lossless Decompositions
Definition: A decomposition of a schema R with FDs F,
into attribute sets X and Y, is lossless with respect to
F if for every instance r of R that satisfies F
r = π_X(r) ⋈ π_Y(r)
In other words, we can recover r from the natural join of
the decomposed versions of r.
Slide 29
Example of a Lossless Decomposition
R = EmpDept, X = EID, Name, DeptID, Y = DeptID, DeptName
r =
EID   Name    DeptID   DeptName
A01   Ali     12       Wing
A12   Eric    10       Tail
A13   Eric    12       Wing
A03   Tyler   12       Wing
π_X(r) =
EID   Name    DeptID
A01   Ali     12
A12   Eric    10
A13   Eric    12
A03   Tyler   12
π_Y(r) =
DeptID   DeptName
12       Wing
10       Tail
π_X(r) ⋈ π_Y(r) = r
EID   Name    DeptID   DeptName
A01   Ali     12       Wing
A12   Eric    10       Tail
A13   Eric    12       Wing
A03   Tyler   12       Wing
Slide 30
Example of a Lossy Decomposition
R = Enroll, X = StudID, Grade, Y = ClassID, Grade
r =
StudID   ClassID   Grade
123      CS386     A
456      CS410     A
π_X(r) =
StudID   Grade
123      A
456      A
π_Y(r) =
ClassID   Grade
CS386     A
CS410     A
π_X(r) ⋈ π_Y(r) =
StudID   ClassID   Grade
123      CS386     A
123      CS410     A
456      CS410     A
456      CS386     A
Note that the join has extra rows. This always happens
in lossy decompositions.
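The comparison of r with the join of its projections can be reproduced in a few lines. This is a sketch with hypothetical helper names; rows are dicts, and a relation is modelled as a set of sorted (attribute, value) tuples so that duplicates vanish as they do under projection.

```python
def project(rows, attrs):
    """Projection with duplicate elimination; each row is a dict."""
    return {tuple((a, row[a]) for a in attrs) for row in rows}

def natural_join(left, right):
    """Natural join of two projections produced by project()."""
    joined = set()
    for l_row in left:
        for r_row in right:
            l_dict, r_dict = dict(l_row), dict(r_row)
            shared = l_dict.keys() & r_dict.keys()
            if all(l_dict[a] == r_dict[a] for a in shared):
                joined.add(tuple(sorted({**l_dict, **r_dict}.items())))
    return joined

def as_relation(rows):
    return {tuple(sorted(row.items())) for row in rows}

enroll = [
    {"StudID": 123, "ClassID": "CS386", "Grade": "A"},
    {"StudID": 456, "ClassID": "CS410", "Grade": "A"},
]
enroll_join = natural_join(project(enroll, ["StudID", "Grade"]),
                           project(enroll, ["ClassID", "Grade"]))

emp_dept = [
    {"EID": "A01", "Name": "Ali", "DeptID": 12, "DeptName": "Wing"},
    {"EID": "A12", "Name": "Eric", "DeptID": 10, "DeptName": "Tail"},
    {"EID": "A13", "Name": "Eric", "DeptID": 12, "DeptName": "Wing"},
    {"EID": "A03", "Name": "Tyler", "DeptID": 12, "DeptName": "Wing"},
]
emp_dept_join = natural_join(project(emp_dept, ["EID", "Name", "DeptID"]),
                             project(emp_dept, ["DeptID", "DeptName"]))
```

The Enroll join contains the two original rows plus two spurious ones, while the EmpDept join gives back exactly r.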
Slide 31
Producing only Lossless Decompositions
• In our design of database schemas we certainly want
to produce only lossless decompositions. Fortunately
this is easy to guarantee.
Theorem: The decomposition of R with respect to FDs F
into attribute sets R1 and R2 is lossless if and only if
R1 ∩ R2 contains a key for either R1 or R2.
Proof: Page 620 in the text.
Corollary: The BCNF decomposition algorithm produces
only lossless decompositions.
Proof: In this case F includes the FD X → A and the
decomposition is into R1 = R − A and R2 = XA. Then
R1 ∩ R2 = X is a key for XA.
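The theorem gives a test that needs only the FDs, not an instance: compute the closure of the shared attributes and see whether it covers one side. The sketch below assumes FDs are (lhs, rhs) pairs of sets; the function names and the FD StudID, ClassID → Grade for Enroll are assumptions for the example.

```python
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_lossless(r1, r2, fds):
    """True if the shared attributes determine all of r1 or all of r2."""
    reach = closure(set(r1) & set(r2), fds)
    return set(r1) <= reach or set(r2) <= reach

emp_fds = [({"EID"}, {"Name", "DeptID"}), ({"DeptID"}, {"DeptName"})]
emp_ok = is_lossless({"EID", "Name", "DeptID"}, {"DeptID", "DeptName"}, emp_fds)

enroll_fds = [({"StudID", "ClassID"}, {"Grade"})]
enroll_ok = is_lossless({"StudID", "Grade"}, {"ClassID", "Grade"}, enroll_fds)
```

For Emp and Dept the shared attribute DeptID determines all of Dept, so the split is lossless. For StudGrade and ClassGrade the shared attribute Grade determines nothing else, so it is not.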
Slide 32
Where are we?
• In CS 3/586 we have learned how to transform
a Requirements Analysis into an ER Diagram into a
Relational Schema and to transform that losslessly
into a BCNF schema.
• We recall from a previous picture that BCNF tables
are particularly simple, so this looks like a perfect
solution to a very general problem.
• But real schemas are not always BCNF. There is one
more complexity to deal with.
Slide 33
Dependency Preserving Decompositions
• Decompositions should preserve FDs.
• FDs are business requirements that must be enforced.
• Consider an example:
– Emp(Addr, City, State, Zip) with FDs ACS → Z, Z → S
– Keys are ACS and ACZ. Consider the BCNF decomposition:
(Addr, City, Zip) (Zip, State)
– This is BCNF but it does not preserve ACS → Z
– Consider the values
(7315 SW84, Portland, 97223), (97223, OR),
(7315 SW84, Portland, 00000), (00000, OR)
– Each table satisfies its own FDs, yet the join has two rows that
agree on Addr, City and State but differ on Zip.
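The example values can be checked mechanically. In this sketch (helper name fd_holds is hypothetical) the two decomposed tables each pass the only FD that can be checked locally, Zip → State, while their join violates ACS → Z.

```python
def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on lhs while disagreeing on rhs."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        if seen.setdefault(left, right) != right:
            return False
    return True

addr_city_zip = [
    {"Addr": "7315 SW84", "City": "Portland", "Zip": "97223"},
    {"Addr": "7315 SW84", "City": "Portland", "Zip": "00000"},
]
zip_state = [
    {"Zip": "97223", "State": "OR"},
    {"Zip": "00000", "State": "OR"},
]

# Natural join on the shared attribute Zip.
joined = [{**a, **z} for a in addr_city_zip for z in zip_state
          if a["Zip"] == z["Zip"]]

local_ok = fd_holds(zip_state, ["Zip"], ["State"])
acs_z_ok = fd_holds(joined, ["Addr", "City", "State"], ["Zip"])
```

Enforcing ACS → Z on this decomposition would therefore require a join on every update, which is the cost of not preserving the dependency.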
Slide 34
Third Normal Form
• Some schemas do not have a lossless, dependency preserving,
decomposition into BCNF schemas.
• Because of this dilemma, researchers created another normal
form called Third Normal Form (3NF), with the property that
every schema has a lossless dependency preserving
decomposition into 3NF schemas.
• A schema R with FDs F is in Third Normal Form if for every
X → A in F, one of these is true:
– X → A is a trivial FD (i.e., X contains A)
– X → A is a key FD (i.e., X contains a key)
– A is a part of some key for R
• The first two conditions are the definition of BCNF, so every
BCNF schema is also in 3NF.
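The three conditions can be checked directly once the keys are known. The sketch below is one way to do it under assumptions: FDs are (lhs, rhs) pairs of sets, keys are found by brute-force search over attribute subsets, and all function names are hypothetical. The attributes A, C, S, Z stand for Addr, City, State, Zip.

```python
from itertools import combinations

def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(relation, fds):
    """All minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            x = set(combo)
            if any(k <= x for k in keys):
                continue            # contains a smaller key, so not minimal
            if closure(x, fds) >= relation:
                keys.append(x)
    return keys

def is_bcnf(relation, fds):
    # Every FD is trivial or has a left side that contains a key.
    return all(rhs <= lhs or closure(lhs, fds) >= relation for lhs, rhs in fds)

def is_3nf(relation, fds):
    # Additionally allow FDs whose new right-side attributes are part of some key.
    prime = set().union(*candidate_keys(relation, fds))
    return all(closure(lhs, fds) >= relation or (rhs - lhs) <= prime
               for lhs, rhs in fds)

emp = {"A", "C", "S", "Z"}
emp_fds = [({"A", "C", "S"}, {"Z"}), ({"Z"}, {"S"})]
emp_keys = candidate_keys(emp, emp_fds)
```

Z → S is neither trivial nor a key FD, so Emp fails BCNF, but S is part of the key ACS, so Emp passes 3NF.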
Slide 35
Conclusion
• Almost all schemas in real life can be decomposed
into BCNF schemas that preserve all FDs. In this
case, life is wonderful.
• But every once in a while we get a schema like
Emp(Addr, City, State, Zip) with FDs ACS → Z, Z → S
• Recall that its keys are ACS and ACZ. There is no
decomposition into BCNF that preserves FDs!
• On the other hand, this schema is 3NF. Check it!
Slide 36
Conclusion, ctd.
• So in the rare case that we don’t have an ideal
decomposition (lossless, dependency preserving,
into BCNF), rest assured that we can decompose into
3NF instead of BCNF and still obtain a decomposition that is
lossless and dependency preserving.
• The proof of this assertion is in section 19.6.2.
Slide 37
Physical Database Design
• Database development involves three steps
1. ER design
2. Schema refinement (normalization) and view definition
– This defines the conceptual and external schemas
3. Physical Design
– Choose indexes
– More schema refinement
• Consider denormalizing
• Vertical and horizontal decomposition
– Tuning the database and tuning queries
– Deciding how the data will be stored on disks (omitted)
Slide 38
Performance and the Workload
• Note that ER design and normalization are logical
concepts, while physical design is driven by
performance needs.
• First the user tells you what information (logical)
should be in the database, then s/he tells you how
efficiently the database should perform (physical).
• We'll start the physical design process by learning
how to choose indexes for a workload.
• We want to know: What Indexes might improve
performance? What algorithms would they enable?
What indexes are not useful together?
Slide 39
Example
SELECT C.commname, I.donorname
FROM comm C JOIN indiv I USING (commid)
WHERE I.amount>1000 AND C.party='IND';
• A B+ tree index on amount enables an index retrieval of tuples
satisfying I.amount >1000.
– But if there are many such tuples (the index is not selective) it may need
to be clustered.
• An index on party enables an index retrieval of tuples satisfying
C.party='IND'.
– Again, selectivity matters.
• An index on C.commid or I.commid
– enables an Index Nested Loop Join, but it might not be efficient if there
are many tuples in the outer table.
– speeds up a Sort-Merge Join if one or both indexes are clustered
• Given an index on C.commid, an index on C.party is not useful and
similarly for indexes on I.commid and I.amount.
Slide 40
Too many indexes
• Why not declare all useful indexes?
1. The optimizer may not be able to support the plans
you have in mind
– Get to know your optimizer – use EXPLAIN
2. Indexes take up space
– Though nowadays this is not a big problem
3. Indexes slow updates
4. Some indexes are not useful together
5. The optimizer will be slower because it has more
choices
6. Indexes take time to create
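To make "get to know your optimizer" concrete, here is a small sketch that asks an optimizer for its plan before and after an index is declared. It uses SQLite's EXPLAIN QUERY PLAN purely as an assumption for illustration; the indiv column types and the index name idx_indiv_amount are also made up, and other systems have their own EXPLAIN syntax and output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE indiv (donorname TEXT, commid INTEGER, amount REAL)")

def plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT donorname FROM indiv WHERE amount > 1000"

before = plan(query)   # no index yet: the table has to be scanned
conn.execute("CREATE INDEX idx_indiv_amount ON indiv (amount)")
after = plan(query)    # the range predicate can now use the index
```

Printing before and after shows whether the optimizer actually chose the index, which is the only reliable way to know that a declared index supports the plan you had in mind.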
Slide 41
Choosing indexes in the real world
• As illustrated on the previous two pages, choosing
indexes is an extremely complex task.
• The big 3 commercial DBMSs provide utilities to do
the work for you
– Microsoft: AutoAdmin
– DB2: Autonomic Computing
– Oracle: Automatic Database Diagnostic Monitor
Slide 42
Automated Index Selection
• An algorithm for choosing indexes:
– Input: schema, workload, performance requirements
– Output: An index configuration whose cost (to execute the
workload) is minimal.
– Complexity: For a single table with 10 attributes, there are
30,240 different 5-attribute indexes.
• How do we choose among all those possibilities?
– Consider only single- or two- attribute indexes.
– Consider indexes only on relevant attributes
– Still need to prune search space intelligently
• Computing the cost of a workload is very expensive – why?
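The figure 30,240 matches the number of ordered choices of 5 attributes out of 10 (10 · 9 · 8 · 7 · 6), since the order of columns in a multi-attribute index matters. A quick check, which also counts the candidates left after restricting to one- or two-attribute indexes:

```python
from math import perm

# Ordered selections of 5 attributes from 10.
five_attribute_indexes = perm(10, 5)

# Restricting to single- or two-attribute indexes on the same table.
small_indexes = perm(10, 1) + perm(10, 2)
```

Even the restricted count applies per table, and the algorithm must consider combinations of indexes, so pruning is still needed.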
Slide 43
More Schema Refinement
• We have studied one kind of schema refinement,
namely normalizing a schema by decomposing it into
3NF or BCNF schemas. This is part of logical
design.
• Physical design, driven by performance goals,
includes other types of schema refinement, which we
will study now. These include de-normalization (!),
vertical decomposition and horizontal decomposition.
Slide 44
De-Normalization
• Recall the relation
– CustState(CustID, Address, City, Zip, State)
Here is its BCNF decomposition/Normalization
– Cust(CustID,Address, City, Zip)
State(Zip,State)
• Suppose we have done the normalization and the
query
SELECT C.CustID,C.Address, C.City, C.Zip,
S.State
FROM Cust C, State S
WHERE C.Zip = S.Zip;
is a frequent and important query in the company.
Slide 45
De-Normalization, ctd.
• The join query will be expensive, even if we declare
indexes (which will be costly too).
• A possible solution is to denormalize the tables back
to CustState.
– Then the previous query will run much more quickly
• What are the disadvantages of denormalization?
– Space wasted
• But space is cheap nowadays
– Anomalies when data changes
• But zip codes and states are unlikely to change
• In real shops, denormalization is done to improve
performance, even when data is likely to change.
Slide 46
Vertical Decomposition
• Consider the BCNF relation
Emp(EID, Address, City, State, Wage, DeptID)
• Suppose that the HR department issues queries about EID, Address,
City and State and the rest of the company issues queries about EID,
Wage and DeptID.
• What is the advantage of storing the Emp information in these two
relations?
EmpHR(EID, Address, City, State)
EmpComp(EID, Wage, DeptID)
– All the queries will run faster because they process smaller tables.
• For obvious reasons this is called a vertical decomposition
Slide 47
Horizontal Decomposition
• Consider again the relation
Emp(EID, Address, City, State, Wage, DeptID)
• Now suppose that most Emp queries are from the
Washington or Oregon branches of the company, who
issue queries about Washington or Oregon employees,
respectively.
• Surely you see the advantage of storing the Emp
information in two relations, EmpOR and EmpWA,
consisting of OR and WA employees, respectively.
• Why is this called a horizontal decomposition?
Slide 48
Masking Decompositions with Views
• If someone in the company wants to issue a query
about the old Emp relation, or if there is old software
that uses the Emp relation, this is possible with the
use of a view, for example
CREATE VIEW Emp AS
SELECT * FROM EmpOR
UNION
SELECT * FROM EmpWA;
Slide 49
Tuning the Database
• We have described the steps a DBA takes during initial physical design
of a database, driven by performance requirements: choosing indexes,
denormalization and other schema refinement, and physical storage.
• These steps continue throughout the life of a database, because
everything about the database changes: queries and their importance,
schemas, and data.
• Changing the design of a database during the life of a database is
called tuning.
– Tuning also involves other steps such as updating statistics and reclustering
tables.
• Tuning is driven by two kinds of information
– Utilities that generate performance statistics
• E.g., disk usage, response times
– User complaints
• Hopefully utilities will warn the DBA of problems before users complain.
Slide 50
Tuning Queries
• Sometimes a utility or a customer will identify a specific query as
a problem (poor response time and/or excessive use of
resources). What should you do?
• The first step: is it the fault of the DBMS?
– Check to see how much time/resources the DBMS is using vs the
network, the OS, etc.
• The next step is to use EXPLAIN/SHOW PLAN, etc to find out
what plan the optimizer is using to execute the query, then tune
the query.
• There are various techniques to tune queries:
– Rewrite the query to use existing indexes
– Simplify the query, e.g., by eliminating DISTINCT, GROUP
BY/HAVING clauses, or eliminating temporary relations
– Flatten nested queries (already studied)
– Alter the index configuration (already studied)
Slide 51
Rewriting a query to use existing indexes
• Consider the query
SELECT E.EID
FROM Emp E
WHERE E.salary=1000 OR E.age=25;
• Suppose there are selective indexes on salary and age, but the
optimizer is scanning the entire table.
• You could rewrite the query as a UNION
SELECT E.EID
FROM Emp E
WHERE E.salary=1000
UNION
SELECT E.EID
FROM Emp E
WHERE E.age=25;
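A quick way to gain confidence in such a rewrite is to run both forms on sample data and compare the results. The sketch below does this in SQLite with a made-up Emp table (the rows and column types are assumptions) and a UNION over Emp alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Emp (EID TEXT PRIMARY KEY, salary INTEGER, age INTEGER)")
conn.executemany("INSERT INTO Emp VALUES (?, ?, ?)", [
    ("E1", 1000, 25),   # matches both predicates
    ("E2", 1000, 40),   # matches salary only
    ("E3", 2000, 25),   # matches age only
    ("E4", 3000, 50),   # matches neither
])

or_rows = sorted(conn.execute(
    "SELECT E.EID FROM Emp E WHERE E.salary=1000 OR E.age=25"))

union_rows = sorted(conn.execute(
    "SELECT E.EID FROM Emp E WHERE E.salary=1000 "
    "UNION "
    "SELECT E.EID FROM Emp E WHERE E.age=25"))
```

UNION removes duplicates, so an employee matching both predicates appears once. Because EID is a key, the OR form cannot return duplicates either, and the two queries agree.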
Slide 52
Practice: Simplifying Queries*
• Can you simplify these queries?
SELECT DISTINCT E.EID
FROM Emp E
WHERE E.salary > 1000;

SELECT AVG(E.salary)
FROM Emp E
WHERE E.salary > 1000
GROUP BY E.age
HAVING E.age=25;
Slide 53
Practice: Eliminate temp relations
• Usually (not always) an optimizer is more efficient without
temporary relations. Can you combine these into one query?
SELECT E.sal, D.dno INTO Temp
FROM Emp E, Dept D
WHERE E.dno=D.dno
AND D.mgrname='Joe';

SELECT T.dno, AVG(T.sal)
FROM Temp T
GROUP BY T.dno;
Slide 54
LO8.1: Exercise*
• Identify anomalies associated with this schema.
Include update, insertion and deletion anomalies.
Assigned(EmpID, JobID, EmpName, percent)
• Why do these anomalies occur?
Slide 55
LO8.2: Exercise*
• Identify some possible keys based on this instance.
Include only keys with one or two attributes:
T   W   X   Y   Z
s   A   1   B   2
t   X   5   X   4
u   Z   9   Z   2
s   A   2   B   1
r   X   1   B   2
Slide 56
LO8.3: EXERCISE*
• Identify two possible non-key FDs based on this
instance (identical to the previous slide):
T   W   X   Y   Z
s   A   1   B   2
t   X   5   X   4
u   Z   9   Z   2
s   A   2   B   1
r   X   1   B   2
Slide 57
LO8.4: EXERCISE*
• Given the schema R(A,B,C,D,E) ABD, CDAE
• What are all the keys of this schema?
Slide 58
LO8.5: EXERCISE*
• Given the schema
LoansBC(Branch#, Loan#, Amt, Assets, Cust#, CustName)
including the FDs Branch# → Assets, Cust# → CustName,
decompose it into BCNF relations.
Slide 59