CS 405G: Introduction to Database Systems 13b Exercise Chen Qian

advertisement

CS 405G: Introduction to

Database Systems

13b Exercise

Chen Qian

University of Kentucky

2 nd Normal Form

 Note about 2 nd Normal Form

 by definition, every nonprimary attribute is functionally dependent on every key of R

In other words, R is in its 2 nd normal form if we could not find a partial dependency of a nonprimary key to a key in

R .

2 4/13/2020 Chen Qian @ University of Kentucky

Review

 WorkOn ( EID , Ename , email , PID , hour )

 We say X -> Y is a partial dependency if there exist a X’

X such that X’ -> Y

 e.g. EID, PID -> Ename

 Otherwise , X -> Y is a full dependency

 e.g. EID, PID -> hours

EID

1234

1123

1234

1023

PID

10

9

9

10

Ename email Pname Hours

John Smith jsmith@ac.com

B2B platform 10

Ben Liu bliu@ac.com

CRM 40

John Smith jsmith@ac.com

CRM 30

Susan Sidhuk ssidhuk@ac.com

B2B platform 40

4/13/2020 Chen Qian @ University of Kentucky 3

Third normal form

• 3NF requires that there are no non-trivial functional dependencies of non-key attributes on something other than a superset of a candidate key.

• Recall: non-trivial FD means LHS has no intersection with RHS.

• In summary, all non-key attributes are mutually independent.

4/13/2020 Chen Qian @ University of Kentucky 4

Boyce-Codd normal form (BCNF)

• BCNF requires that there are no non-trivial functional dependencies of attributes on something other than a superset of a candidate key (called a superkey).

• All attributes are dependent on a key, a whole key and nothing but a key (excluding trivial dependencies, like A->A).

4/13/2020 Chen Qian @ University of Kentucky 5

Quiz 4

Q1 (6pt): Consider the following relation:

Explain why it is not in 1NF (3pt). Convert it to multiple tables that satisfy 1NF and 2NF. (3pt)

4/13/2020 Chen Qian @ University of Kentucky 6

Quiz 4

 Because EmpDegrees is not atomic.

4/13/2020 Chen Qian @ University of Kentucky 7

Quiz 4

 Question 2 (4pt) Consider the following relation and its functional dependencies:

Is it in 3NF? (1pt) Is it in BCNF? (2pt) Explain why (1pt)

4/13/2020 Chen Qian @ University of Kentucky 8

Quiz 4

 It is in 3NF

 but not in BCNF

Student_no, course_no

 instr_no

Instr_no

 course_no

 Because the determinant (LHS) of the FD Intr_no -> course_no is not a key attribute, so it is not in BCNF.

 However, since the RHS is a key attribute, it does not violate 3NF.

4/13/2020 Chen Qian @ University of Kentucky 9

Exercise: Functional dependency

 Suppose you are given a relation R with four attributes ABCD.

For of the following sets of FDs, assuming those are the only dependencies that hold for R, do the following: (a) Identify the candidate key(s) for R. (b) Identify the best normal form that R satisfies (1NF, 2NF, 3NF, or BCNF).

Candidate keys: B

R is in 2NF but not 3NF.

4/13/2020 Chen Qian @ University of Kentucky 10

Exercise: Functional dependency

 Suppose you are given a relation R with four attributes ABCD.

For of the following sets of FDs, assuming those are the only dependencies that hold for R, do the following: (a) Identify the candidate key(s) for R. (b) Identify the best normal form that R satisfies (1NF, 2NF, 3NF, or BCNF).

Candidate keys: BD

R is in 1NF but not 2NF.

4/13/2020 Chen Qian @ University of Kentucky 11

Exercise: Functional dependency

 Suppose you are given a relation R with four attributes ABCD.

For of the following sets of FDs, assuming those are the only dependencies that hold for R, do the following: (a) Identify the candidate key(s) for R. (b) Identify the best normal form that R satisfies (1NF, 2NF, 3NF, or BCNF).

3. ABC → D, D → A

Candidate keys: ABC , BCD

R is in 3NF but not BCNF

4/13/2020 Chen Qian @ University of Kentucky 12

Exercise: Functional dependency

 Suppose you are given a relation R with four attributes ABCD.

For of the following sets of FDs, assuming those are the only dependencies that hold for R, do the following: (a) Identify the candidate key(s) for R. (b) Identify the best normal form that R satisfies (1NF, 2NF, 3NF, or BCNF).

4. A → B, BC → D, A → C

Candidate keys: A

R is in 2NF but not 3NF

(because of the FD: BC → D).

4/13/2020 Chen Qian @ University of Kentucky 13

Exercise: Functional dependency

 Suppose you are given a relation R with four attributes ABCD.

For of the following sets of FDs, assuming those are the only dependencies that hold for R, do the following: (a) Identify the candidate key(s) for R. (b) Identify the best normal form that R satisfies (1NF, 2NF, 3NF, or BCNF).

5. AB → C, AB → D, C → A, D → B

(a) Candidate keys: AB , BC , CD , AD

R is in 3NF but not BCNF

(because of the FD: C → A).

4/13/2020 Chen Qian @ University of Kentucky 14

Exercise: Concurrency control

Consider the following actions taken by transaction T 1 on database objects X and Y :

R(X), W(X), R(Y), W(Y)

1. Give an example of another transaction T 2 that, if run concurrently to transaction T without some form of concurrency control, could interfere with T 1.

 If the transaction T2 performed W ( Y ) before T1 performed R ( Y ), and then T2 aborted, the value read by T1 would be invalid and the abort would be cascaded to T1 (i.e. T1 would also have to abort).

4/13/2020 Chen Qian @ University of Kentucky 15

Exercise: Concurrency control

Consider the following actions taken by transaction T 1 on database objects X and Y :

R(X), W(X), R(Y), W(Y)

2. Explain how the use of Strict 2PL would prevent interference between the two transactions.

 Strict 2PL would require T2 to obtain an exclusive lock on Y before writing to it. This lock would have to be held until T2 committed or aborted; this would block T1 from reading Y until

T2 was finished, thus there would be no interference.

4/13/2020 Chen Qian @ University of Kentucky 16

Exercise: disks

 Explain the terms seek time, rotational delay , and transfer time .

 1. Seek time is the time taken to move the disk heads to the track on which a desired block is located.

 2. Rotational delay is the waiting time for the desired block to rotate under the disk head; it is the time required for half a rotation on average, and is usually less than the seek time.

 3. Transfer time is the time to actually read or write the data in the block once the head is positioned, i.e., the time for the disk to rotate over the block.

4/13/2020 Chen Qian @ University of Kentucky 17

Exercise: disks

 If you have a large file that is frequently scanned sequentially, explain how you would store the pages in the file on a disk.

 A: The pages in the file should be stored ‘sequentially’ on a disk. We should put two ‘logically’ adjacent pages as close as possible. In decreasing order of closeness, they could be on the same track, the same cylinder, or an adjacent cylinder.

4/13/2020 Chen Qian @ University of Kentucky 18

Exercise: disks

 Consider a disk with a sector size of 512 bytes, 2000 tracks per surface, 50 sectors per track, five double-sided platters, and average seek time of 10 msec.

1. What is the capacity of a track in bytes? What is the capacity of each surface? What is the capacity of the disk?

bytes/track = bytes/sector

× sectors/track = 512

×

50 =

25 K bytes/surface = bytes/track

× tracks/surface = 25 K

×

2000

= 50 , 000 K bytes/disk = bytes/surface

× surfaces/disk = 50 , 000 K

×

5

×

2 = 500 , 000 K

4/13/2020 Chen Qian @ University of Kentucky 19

Exercise: disks

 Consider a disk with a sector size of 512 bytes, 2000 tracks per surface, 50 sectors per track, five double-sided platters, and average seek time of 10 msec.

3. If the disk platters rotate at 5400 rpm (revolutions per minute), what is the maximum rotational delay?

If the disk platters rotate at 5400rpm, the time required for one complete rotation, which is the maximum rotational delay, is

 The average rotational delay is half of the rotation time, 0.006 seconds.

4/13/2020 Chen Qian @ University of Kentucky 20

Exercise: disks

 Consider a disk with a sector size of 512 bytes, 2000 tracks per surface, 50 sectors per track, five double-sided platters, and average seek time of 10 msec.

 4. If one track of data can be transferred per revolution, what is the transfer rate?

 The capacity of a track is 25K bytes. Since one track of data can be transferred per revolution, the data transfer rate is

4/13/2020 Chen Qian @ University of Kentucky 21

Exercise: disks

Consider … average seek time of 10 msec. suppose that a block size of 1024 bytes is chosen. Suppose that a file containing

100,000 records of 100 bytes each is to be stored on such a disk and that no record is allowed to span two blocks.

 5. What time is required to read a file containing 100,000 records of 100 bytes each sequentially?

A file containing 100,000 records of 100 bytes needs 40 cylinders or 400 tracks in this disk. The transfer time of one track of data is

0.011 seconds. Then it takes 400

×

0 .

011 = 4 .

4 seconds to transfer 400 tracks.

This access seeks the track 40 times. The seek time is 40

×

0 .

01

= 0 .

4 seconds . Therefore, total access time is 4 .

4+ 0 .

4 =

4 .

8 seconds .

4/13/2020 Chen Qian @ University of Kentucky 22

Exercise: disks

6. What is the time required to read a file containing 100,000 records of 100 bytes each in a random order? Assume that each block request incurs the average seek time and rotational delay.

For any block of data, averageaccesstime = seektime + rotationaldelay + transfertime

 The average access time for a block of data would be 16.44 msec.

For a file containing 100,000 records of 100 bytes, the total access time would be 164.4 seconds.

4/13/2020 Chen Qian @ University of Kentucky 23

Tree structure

Each intermediate node can hold up to five pointers and four key values. Each leaf can hold up to four records

Name all the tree nodes to be fetched to answer the following query: “Get all records with search key greater than 38.”

4/13/2020 Chen Qian @ University of Kentucky 24

Tree structure

Name all the tree nodes to be fetched to answer the following query: “Get all records with search key greater than 38.”

I1, I2, and everything in the range [L2..L8].

4/13/2020 Chen Qian @ University of Kentucky 25

Tree structure

 inserting a record with search key 109

4/13/2020 Chen Qian @ University of Kentucky 26

 inserting a record with search key 109

4/13/2020 Chen Qian @ University of Kentucky 27

Tree structure

 deleting the record with search key 81 from the original tree.

4/13/2020 Chen Qian @ University of Kentucky 28

4/13/2020 Chen Qian @ University of Kentucky 29

Tree structure

Name a search key value such that inserting it into the (original) tree would cause an increase in the height of the tree.

Anything to L4 or L5, e.g. 57

4/13/2020 Chen Qian @ University of Kentucky 30

Tree structure

 What we can infer about subtrees A, B, and C. ?

4/13/2020 Chen Qian @ University of Kentucky 31

Tree structure

 We can infer several things about subtrees A, B, and C. First of all, they each must have height one, since their “sibling” trees

(those rooted at I2 and I3) have height one. Also, we know the ranges of these trees (assuming duplicates fit on the same leaf): subtree A holds search keys less than 10, B contains keys

10 and < 20, and C has keys

20 and < 30. In addition, each intermediate node has at least 2 key values and 3 pointers.

4/13/2020 Chen Qian @ University of Kentucky 32

Tree structure

 Suppose that this is an ISAM index. What is the minimum number of insertions needed to create a chain of three overflow pages?

4/13/2020 Chen Qian @ University of Kentucky 33

Tree structure

 If this is an ISAM tree, we would have to insert at least nine search keys in order to develop an overflow chain of length three.

These keys could be any that would map to L4, L5, L7, or L8, all of which are full and thus would need overflow pages on the next insertion. The first insert to one of these pages would create the first overflow page, the fifth insert would create the second overflow page, and the ninth insert would create the third overflow page (for a total of one leaf and three overflow pages).

4/13/2020 Chen Qian @ University of Kentucky 34

 Want to present on 4/21?

 HW4 due on 4/21

4/13/2020 Chen Qian @ University of Kentucky 35

Download