Assignment 6

SE305 Database System Technology
Due: Nov. 5th, 2015
1. (15’) RAID systems typically allow you to replace failed disks without stopping
access to the system. Thus, the data in the failed disk must be rebuilt and written
to the replacement disk while the system is in operation. Which of the RAID
levels yields the least amount of interference between the rebuild and ongoing
disk access? Explain your answer.
2. (10’) List two advantages and two disadvantages of each of the following
strategies for storing a relational database:
a) Store each relation in one file.
b) Store multiple relations (perhaps even the entire database) in one file.
3. (10’) What is the difference between a clustering index and a secondary index?
4. (20’) Suppose that we are using extendable hashing on a file that contains records
with the following search-key values:
2, 3, 5, 7, 11, 19, 23, 29, 31
Show the extendable hash structure for this file if the hash function is h(x) = x
mod 8 and buckets can hold three records.
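As a starting point, the hash values of the given keys can be tabulated with a short script. This is only a minimal sketch: the names KEYS, h, and hash_bits are illustrative and not part of the assignment, and whether the directory is indexed by the leading or the trailing bits of the 3-bit hash value is a convention that should follow the one used in the course.

```python
# Search-key values from the question.
KEYS = [2, 3, 5, 7, 11, 19, 23, 29, 31]


def h(x: int) -> int:
    """Hash function given in the question: h(x) = x mod 8."""
    return x % 8


def hash_bits(x: int, width: int = 3) -> str:
    """Hash value as a fixed-width bit string (3 bits suffice for mod 8)."""
    return format(h(x), f"0{width}b")


if __name__ == "__main__":
    for key in KEYS:
        print(f"{key:>2} -> h = {h(key)} -> bits {hash_bits(key)}")
```

The bit strings are what decide bucket placement; the bucket splits and the directory growth still have to be worked out by hand from the bucket capacity of three records.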
5. (15’) Suppose there is a relation R(A, B, C), with a B+-tree index with search key
(A, B).
a. What is the worst case cost of finding records satisfying 10 < A < 50 using
this index, in terms of the number of records retrieved n1 and the height h of
the tree?
b. What is the worst case cost of finding records satisfying 10 < A < 50 ∧ 5 <
B < 10 using this index, in terms of the number of records n2 that satisfy this
selection, as well as n1 and h defined above?
c. Under what condition on n1 and n2 would the index be an efficient way of
finding records satisfying 10 < A < 50 ∧ 5 < B < 10?
6. (20’) Answer the following questions about this scenario: a file with 2,000,000 blocks
is to be sorted using 17 available buffer blocks.
a. How many runs will you produce in the first pass?
b. How many passes will it take to sort the file completely?
c. What is the total I/O cost of sorting the file?
d. How many buffer blocks do you need to sort the file completely in just two
passes?
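The quantities asked for can be checked with a small calculator. The sketch below assumes the usual external merge sort cost model, which the question itself does not spell out: the first pass fills all buffer blocks to create sorted runs, each later pass merges (buffers - 1) runs at a time, and every pass reads and writes the whole file once, including the final write. All function names are illustrative.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def initial_runs(n_blocks: int, n_buffers: int) -> int:
    """Runs after pass 1, assuming each run fills all buffer blocks."""
    return ceil_div(n_blocks, n_buffers)


def total_passes(n_blocks: int, n_buffers: int) -> int:
    """Run-creation pass plus merge passes, merging (n_buffers - 1) runs at a time."""
    if n_buffers < 3:
        raise ValueError("need at least 3 buffer blocks to merge")
    runs = initial_runs(n_blocks, n_buffers)
    passes = 1
    while runs > 1:
        runs = ceil_div(runs, n_buffers - 1)
        passes += 1
    return passes


def total_io(n_blocks: int, n_buffers: int) -> int:
    """Block transfers, assuming every pass reads and writes each block once."""
    return 2 * n_blocks * total_passes(n_blocks, n_buffers)


def min_buffers_for_two_passes(n_blocks: int) -> int:
    """Smallest buffer count whose initial runs fit into a single merge pass."""
    m = 3
    while initial_runs(n_blocks, m) > m - 1:
        m += 1
    return m


if __name__ == "__main__":
    n, b = 2_000_000, 17
    print(initial_runs(n, b), total_passes(n, b), total_io(n, b))
    print(min_buffers_for_two_passes(n))
```

If the course's cost model omits the final write or counts seeks separately, total_io needs to be adjusted accordingly.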
7. (15’) Suppose that a B+-tree index on branch_city is available on relation branch,
and that no other index is available. What would be the best way to handle the
following selections that involve negation?
a. σ¬(branch_city < “Brooklyn”)(branch)
b. σ¬(branch_city = “Brooklyn”)(branch)
c. σ¬(branch_city < “Brooklyn” ∨ assets < 5000)(branch)
8. (15’) Consider the following two transactions:
T1: Read(A);
    Read(B);
    If A = 0 then B := B + 1;
    Write(B).
T2: Read(B);
    Read(A);
    If B = 0 then A := A + 1;
    Write(A).
Let the consistency requirement be A = 0 ∨ B = 0, with A = B = 0 as the initial
values.
a. Show that every serial execution involving these two transactions preserves
the consistency of the database.
b. Show a concurrent execution of T1 and T2 that produces a nonserializable
schedule.
c. Is there a concurrent execution of T1 and T2 that produces a serializable
schedule?
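To experiment with schedules of T1 and T2, the two transactions can be simulated step by step. The sketch below is one possible encoding, not a prescribed method: each transaction is a list of four steps, a schedule is a sequence of transaction names saying whose next step runs, and the names PROGRAMS, run, and consistent are illustrative. It only reports the final database state; arguing about serializability still requires looking at the conflicting reads and writes.

```python
# Each step is (operation, item[, target]); values are read into
# per-transaction local copies and written back explicitly.
T1 = [("read", "A"), ("read", "B"), ("incr_if_zero", "A", "B"), ("write", "B")]
T2 = [("read", "B"), ("read", "A"), ("incr_if_zero", "B", "A"), ("write", "A")]
PROGRAMS = {"T1": T1, "T2": T2}


def run(schedule):
    """Execute a schedule (sequence of 'T1'/'T2') from the state A = B = 0."""
    db = {"A": 0, "B": 0}
    local = {"T1": {}, "T2": {}}
    pc = {"T1": 0, "T2": 0}  # next step index per transaction
    for t in schedule:
        op = PROGRAMS[t][pc[t]]
        pc[t] += 1
        if op[0] == "read":
            local[t][op[1]] = db[op[1]]
        elif op[0] == "incr_if_zero":
            # "If X = 0 then Y := Y + 1" on the local copies
            if local[t][op[1]] == 0:
                local[t][op[2]] += 1
        else:  # write
            db[op[1]] = local[t][op[1]]
    return db


def consistent(db):
    """Consistency requirement from the question: A = 0 or B = 0."""
    return db["A"] == 0 or db["B"] == 0


if __name__ == "__main__":
    serial = ["T1"] * 4 + ["T2"] * 4
    interleaved = ["T1", "T2"] * 4
    for name, s in [("serial", serial), ("interleaved", interleaved)]:
        final = run(s)
        print(name, final, consistent(final))
```

Other interleavings can be tried by passing any sequence that contains each transaction name exactly four times.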