Final

COSC 671
Final WINTER 2014
Name:
NOTE: YOU MUST ANSWER 6, 7, 8. Questions 1 – 5 are optional and count for extra
credit only.
Open books, open notes, open Internet.
Return your answer electronically (good formats are: pdf, odt, sxw, rtf, txt, doc, png, jpg).
1. For each of the following schedules, state (1) whether the schedule is serializable and
(2) whether the schedule is conflict serializable.
r: read
w: write
a: abort
c: commit
a. r(T1,x), r(T2,x), w(T1,x), w(T2,x), c(T2), c(T1)
b. r(T1,x), r(T2,x), r(T2,y), w(T2,y), c(T2), w(T1,y), c(T1)
2. Give the precedence graph for each of the schedules given in #1. The graph can be
given as a picture (nodes and arcs), adjacency matrix or adjacency list.
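The adjacency-list form mentioned above can be sketched in Python as follows. This is a minimal illustration, not a reference solution: the function names (parse, precedence_graph, has_cycle) and the example schedule are assumptions introduced here, and the example schedule is deliberately not one of the schedules from #1. An edge Ti -> Tj is added whenever an operation of Ti conflicts with a later operation of Tj (same item, different transactions, at least one write).

```python
import re
from collections import defaultdict

def parse(schedule):
    """Parse 'r(T1,x), w(T2,x), c(T1)' into (op, txn, item) tuples."""
    pattern = r"([rwac])\s*\(\s*(T\d+)\s*(?:,\s*(\w+))?\s*\)"
    return [(op, txn, item or None) for op, txn, item in re.findall(pattern, schedule)]

def precedence_graph(ops):
    """Adjacency list with an edge Ti -> Tj for each pair of conflicting operations."""
    graph = defaultdict(set)
    for i, (op1, t1, x1) in enumerate(ops):
        if op1 not in ("r", "w"):
            continue  # commits and aborts do not conflict on data items
        for op2, t2, x2 in ops[i + 1:]:
            if op2 in ("r", "w") and t1 != t2 and x1 == x2 and "w" in (op1, op2):
                graph[t1].add(t2)
    return {t: sorted(targets) for t, targets in graph.items()}

def has_cycle(graph):
    """Depth-first search; a cycle means the schedule is not conflict serializable."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = defaultdict(int)

    def visit(node):
        color[node] = GREY
        for nxt in graph.get(node, []):
            if color[nxt] == GREY or (color[nxt] == WHITE and visit(nxt)):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in list(graph))

# Hypothetical schedule, not taken from this exam.
example = "r(T1,x), w(T2,x), r(T2,y), w(T1,y), c(T1), c(T2)"
graph = precedence_graph(parse(example))
```

For the hypothetical schedule, graph contains the edges T1 -> T2 (from r(T1,x) before w(T2,x)) and T2 -> T1 (from r(T2,y) before w(T1,y)), so has_cycle(graph) reports a cycle.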
3. For each of the following sets of transactions, give a schedule that maximizes
interleaving while avoiding conflicts by using the 2PL protocol (there may be more than
one correct answer; give only one answer for each).
a.
T1: r(x), w(x), c
T2: r(x), w(x), c
b.
T1: r(x), r(y), c
T2: r(x), w(x), c, r(y), w(y), c
4. For the following schedule (there is only one schedule given in the problem), give the
schedule that results from using timestamping with wait-die.
r(T1,x), r(T2,x), r(T2,y), w(T2,y), r(T3,x), w(T3,x),
c(T3), c(T2), w(T1,y), c(T1)
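As a reminder of the wait-die rule itself, the decision made when a transaction requests a lock held in a conflicting mode can be sketched as below. This is only a sketch under the usual textbook assumption that a smaller timestamp means an older transaction; the function name wait_die and the string results are hypothetical, and the timestamps assigned in the schedule above are left to the reader.

```python
def wait_die(requester_ts, holder_ts):
    """Decide what a requester does when blocked by a conflicting lock holder.

    Assumption: a smaller timestamp means an older transaction.
    """
    if requester_ts < holder_ts:
        return "wait"  # the older requester is allowed to wait for the younger holder
    return "die"       # the younger requester aborts and restarts with its original timestamp
```

For example, wait_die(1, 2) gives "wait", while wait_die(3, 2) gives "die".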
5. A database has two tables, X and Y. X has schema (a, b), and the current instance has
tuples x1, x2, … xm. Y has schema (c, d), and the current instance has tuples y1, y2, …
yn.
Recall the multi-granularity locking protocol described in class (see also
http://en.wikipedia.org/wiki/Multiple_granularity_locking ). The hierarchy of database
objects is shown here.
DB
    X
        x1
            a
            b
        x2
            a
            b
        …
    Y
        y1
            c
            d
        y2
            c
            d
        …
For each of the following schedules, give all locks held over the database hierarchy at
the end of the schedule's operations (two separate answers).
a. r(T1,x1.a), r(T2, y1.c)
b. r(T1, x1.a), w(T1, x1.a), r(T2, x2.b)
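The general rule of multi-granularity locking (intention locks on every ancestor, then the actual lock on the target) can be sketched as follows. This is an illustration only: the function name locks_for and the small hierarchy DB, R, r1 are hypothetical and are not the hierarchy of this question, and the sketch ignores lock upgrades and compatibility checks between transactions.

```python
def locks_for(path, mode):
    """Return the (node, lock) pairs needed to access the last node of path.

    path: nodes from the root down to the target, e.g. ["DB", "R", "r1"].
    mode: "S" for a read, "X" for a write.
    """
    if mode not in ("S", "X"):
        raise ValueError("mode must be 'S' or 'X'")
    intent = "IS" if mode == "S" else "IX"  # intention lock placed on each ancestor
    return [(node, intent) for node in path[:-1]] + [(path[-1], mode)]

# Hypothetical hierarchy, not the one from this exam.
read_locks = locks_for(["DB", "R", "r1"], "S")
write_locks = locks_for(["DB", "R", "r1"], "X")
```

Here read_locks is IS on DB and R followed by S on r1, and write_locks is IX on DB and R followed by X on r1. Locks are acquired from the root downward and released from the leaf upward.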
6. Suppose you have two interleaved transactions, T1, T2.
T1: r(x), r(y), x += 1, y += 1, w(x), w(y), c
T2: r(x), x += 10, w(x), c
The trace of actual execution is as follows (the subscript gives the transaction number):
r1(x), r2(x), r1(y), x1+=1, y1+=1, w1(x), x2+=10, FAIL!
Starting values: x = 5, y = 50
At the point of FAIL!, the entire DB system fails; a few minutes later, recovery starts
and completes successfully.
(a) Give the log file just before the FAIL! occurs.
(b) Give the values of x and y after recovery completes.
(c) We can suppose the DB system failed because of some error in data protection
(that is, in implementing correct locking). Identify the statement that should not
have been allowed to execute.
(d) Continuing from (c), how would 2PL prevent the statement identified in (c) from
causing the DB failure?
7. In a distributed database system, suppose tables R1, R2, R3 are created at node1.
Then R2 and R3 are migrated to node2. Give the distributed catalog information (global
catalog and local catalogs) that reflects the current distribution of R1, R2, and R3.
8. This is a ‘thinking’ question. We did not discuss this in class. You need to consider the
proposals and discuss the advantages/disadvantages of each (if any). I expect approximately
one page of text/figures to present your basic opinions.
In a distributed database system, the table R is horizontally partitioned (into R1, R2)
between node1 and node2: R1 is put on node1, R2 is put on node2. You also have a
secondary index, I, on R, attribute A.
Consider three possibilities:
1. I is stored on node1 and replicated on node2.
2. I is stored on node3.
3. I is partitioned into I1 and I2, where I1 is the secondary index on R1.A and I2 is the
secondary index on R2.A.
A query performs a join using R.A as one of the join attributes.
Which is the best proposal, and what are its advantages and disadvantages compared to
the other two proposals? Consider storage requirements, communication costs, amount of
parallelism, and processing time.