Exercises to Chapter 15 – Transactions

15.1. List the ACID properties and explain the usefulness of each.
A – Atomicity: a transaction consists of several actions, and after any intermediate action
the database may be in an inconsistent state; hence either all of the actions must take
effect or none of them.
C – Consistency: a transaction executed in isolation must take the database from one
consistent state to another, i.e. preserve consistency.
I – Isolation: to improve throughput, transactions may run concurrently, but each of
them must produce the same result as in exclusive (isolated) execution.
D – Durability: the changes made by a successfully completed transaction must remain in
the database even if the system subsequently fails.
15.2. Suppose that there is a database system that never fails. Is a recovery manager
required for this system?
Yes. Even without system failures, transactions may be aborted because of software errors
in the transaction code or because some required data are absent, and an aborted
transaction must be rolled back. Performing that rollback is a task of the recovery manager.
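The rollback of a logically failed transaction can be illustrated with a minimal sketch
using Python's standard sqlite3 module. The account table, the transfer function and the
"insufficient funds" rule are hypothetical and serve only as an example of a transaction
that aborts although the system itself never fails.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from account `src` to `dst`; undo everything on a logical error."""
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        (balance,) = conn.execute(
            "SELECT balance FROM account WHERE id = ?", (src,)
        ).fetchone()
        if balance < 0:
            # Hypothetical business rule: the transaction aborts itself.
            raise ValueError("insufficient funds")
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
        conn.commit()
        return True
    except ValueError:
        conn.rollback()  # the first UPDATE is undone, no hardware failure involved
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

ok = transfer(conn, 1, 2, 500)
balances = conn.execute("SELECT balance FROM account ORDER BY id").fetchall()
```

Here the transfer of 500 is aborted, and the balances stay at 100 and 50 because the
partial update is rolled back.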
15.3. Consider a file system. What are the steps involved in the creation and deletion of
files and in writing data to a file? Explain how the issues of atomicity and durability are
relevant to the creation and deletion of files and to writing data to files.
To create or delete a file, the directory entry for the file must be created or deleted.
On creation, disk clusters are allocated to the file; on deletion, the allocated clusters
are deallocated. Data are written first to a buffer in RAM, from which they are
transferred to disk when the buffer fills, when the file is closed, or when the buffer is
flushed by an explicit command.
Durability is important for files: we expect saved data to persist on disk.
Atomicity is not guaranteed for writes made by applications, but it must be provided for
the creation and deletion operations themselves; otherwise a failure between updating the
directory entry and updating the cluster allocation would leave clusters that are
allocated but unreachable, and disk space would be lost.
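The buffering described above means that an application which needs durability, and
something close to atomicity, for a file write has to ask for it explicitly. The following
is a minimal sketch of one common approach (write to a temporary file, flush, fsync, then
rename); the function name write_file_atomically is hypothetical, and the sketch does not
fsync the containing directory, which some file systems need for the rename itself to be
durable.

```python
import os
import tempfile

def write_file_atomically(path, data):
    """Replace the file at `path` with `data` (bytes), all or nothing."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so the final rename
    # stays within one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()              # move data from the program buffer to the OS
            os.fsync(f.fileno())   # ask the OS to push it to the disk
        os.replace(tmp_path, path)  # readers see the old or the new file, never a mix
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the program fails before os.replace, the original file is untouched; after it, the new
content is in place.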
15.4. Database-system implementers have paid much attention to the ACID properties, but
file-system implementers have not. Why?
Because for database applications atomicity and durability are crucial: a violation can
cause real-world problems, so the effort of providing the ACID properties was worth its cost.
15.5. List the possible states through which a transaction may pass.
Active => Partially committed => Committed
Active => Partially committed => Failed => Aborted
Active => Failed => Aborted
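The three paths above can be summarized as a small transition table. This is only an
illustrative sketch; the names TRANSITIONS and is_valid_path are hypothetical.

```python
# Allowed next states for each transaction state.
TRANSITIONS = {
    "active": {"partially committed", "failed"},
    "partially committed": {"committed", "failed"},
    "failed": {"aborted"},
    "committed": set(),   # terminal state
    "aborted": set(),     # terminal state
}

def is_valid_path(states):
    """Check that a sequence of states starts in 'active' and follows the table."""
    if not states or states[0] != "active":
        return False
    return all(nxt in TRANSITIONS[cur] for cur, nxt in zip(states, states[1:]))
```

For example, is_valid_path(["active", "failed", "aborted"]) holds, while a path that goes
from "committed" to "aborted" does not.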
15.6. Why is concurrent execution of transactions important in the case of long
transactions or transactions working with (slow) disk, and not important for short
transactions?
Concurrent execution of transactions involves overhead for transaction management and
context switching. If transactions are short, this overhead may be comparable to their
execution time, and concurrent execution brings no benefit. But when transactions work
with disk, while one of them waits for an I/O operation to complete, another can use the
processor. Also, when long and short transactions are mixed, concurrent execution
reduces the response time of the short transactions and increases the throughput of the
system.
15.7. Explain the distinction between serial and serializable schedules.
A serial schedule executes transactions one after another, with no interleaving of their
operations. A serializable schedule is a concurrent (interleaved) schedule that is
equivalent, in the conflict or view sense, to some serial schedule, i.e. it produces the
same result.
15.8. Consider the following two transactions:
T1: read(A);
    read(B);
    if (A == 0) B++;
    write(B);
T2: read(B);
    read(A);
    if (B == 0) A++;
    write(A);
Let the consistency requirement be A == 0 ∨ B == 0, with A = B = 0 as the initial values.
a) Show that every serial execution of these transactions preserves consistency.
b) Show that concurrent execution produces a non-serializable schedule.
c) Is there a concurrent execution that produces a serializable schedule?
a) There are only two serial schedules: T1 T2 and T2 T1.
Consider the first one, T1 T2.
T1 reads the initial values of A and B into memory (0, 0). A satisfies the condition of the
if statement, so B becomes 1, and B = 1 is written back to disk. After T1 completes,
A = 0 and B = 1, so the consistency condition holds.
Then T2 reads B and A (1, 0). B does not satisfy the condition of the if statement, so A is
not modified, and A = 0 is written back unchanged; hence consistency is again preserved.
The second schedule, T2 T1, is symmetric: it ends with A = 1, B = 0, so consistency is
preserved as well.
b, c) In a concurrent (non-serial) execution, at least one operation of each transaction
must start before the other transaction terminates. The last operation of T1 is write(B)
and the last operation of T2 is write(A). The first operation of T1 is read(A), which
conflicts with write(A) of T2 and precedes it; the first operation of T2 is read(B), which
conflicts with write(B) of T1 and precedes it. The precedence graph therefore contains both
T1 -> T2 and T2 -> T1, so no concurrent schedule is conflict serializable. Nor is any of
them view serializable: the write is the last operation of each transaction, so in any
concurrent schedule both transactions read the initial values of A and B, whereas in any
serial schedule one transaction reads the value written by the previous one. Hence the
answer to c) is no.
Since in any concurrent schedule both transactions read the initial values of A and B
(0, 0), the if condition is true in both of them, and each makes its modification. After
the concurrent execution both A and B have been incremented, giving (1, 1), so the
consistency condition is violated.
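The argument of 15.8 can be checked by brute force. The sketch below enumerates all 20
interleavings of T1 and T2 and executes each of them from A = B = 0. The encoding of
operations as (transaction, action, item) triples and all function names are assumptions
made for this illustration only.

```python
from itertools import combinations

# Operations are (transaction, action, item) triples, in program order.
T1 = [("T1", "read", "A"), ("T1", "read", "B"), ("T1", "write", "B")]
T2 = [("T2", "read", "B"), ("T2", "read", "A"), ("T2", "write", "A")]

def interleavings(t1, t2):
    """Yield every merge of t1 and t2 that keeps each program order."""
    n = len(t1) + len(t2)
    for slots in combinations(range(n), len(t1)):
        chosen = set(slots)
        it1, it2 = iter(t1), iter(t2)
        yield [next(it1) if k in chosen else next(it2) for k in range(n)]

def run(schedule):
    """Execute a schedule from A = B = 0 and return the final (A, B)."""
    db = {"A": 0, "B": 0}
    local = {"T1": {}, "T2": {}}
    for txn, action, item in schedule:
        if action == "read":
            local[txn][item] = db[item]
            continue
        # The if statement uses only local copies, so it can be applied
        # right before the write without changing the outcome.
        v = local[txn]
        if txn == "T1" and v["A"] == 0:
            v["B"] += 1
        if txn == "T2" and v["B"] == 0:
            v["A"] += 1
        db[item] = v[item]
    return db["A"], db["B"]

def is_serial(schedule):
    order = [txn for txn, _, _ in schedule]
    return order in (sorted(order), sorted(order, reverse=True))

results = [(is_serial(s), run(s)) for s in interleavings(T1, T2)]
serial_results = sorted(r for serial, r in results if serial)
concurrent_results = {r for serial, r in results if not serial}
```

Under these assumptions the two serial schedules end in (0, 1) and (1, 0), while every one
of the 18 interleaved schedules ends in (1, 1), which violates A == 0 ∨ B == 0.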
15.9. Since every conflict-serializable schedule is also view serializable, why do we
emphasize conflict serializability rather than view serializability?
Because conflict serializability can be checked by a simple, efficient algorithm (cycle
detection in the precedence graph), while checking view serializability is an NP-complete
problem.
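To show how simple the conflict-serializability test is, here is a minimal sketch that
builds the precedence graph of a schedule and looks for a cycle with a depth-first search.
The schedule encoding, a list of (transaction, action, item) triples with action "read" or
"write", and the function names are assumptions of this example.

```python
def precedence_edges(schedule):
    """Return the set of edges (Ti, Tj) of the precedence graph."""
    edges = set()
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            # Two operations conflict if they belong to different transactions,
            # touch the same item, and at least one of them is a write.
            if ti != tj and item_i == item_j and "write" in (op_i, op_j):
                edges.add((ti, tj))
    return edges

def has_cycle(edges):
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        state[node] = "visiting"
        for nxt in graph[node]:
            if state.get(nxt) == "visiting":
                return True          # back edge: a cycle exists
            if nxt not in state and visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(node not in state and visit(node) for node in graph)

def is_conflict_serializable(schedule):
    return not has_cycle(precedence_edges(schedule))
```

Both steps run in time polynomial in the length of the schedule, in contrast to the test
for view serializability.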
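15.10. Consider the precedence graph
[Figure: precedence graph with nodes T1, T2, T3, T4, T5 and edges T1 -> T2, T1 -> T3,
T1 -> T4, T2 -> T3, T2 -> T4, T3 -> T5, T4 -> T5]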
Is the corresponding schedule conflict-serializable?
A precedence graph is built for some schedule S. Its nodes correspond to transactions, and
two nodes T1 and T2 are connected by an edge directed from T1 to T2 if these transactions
have a pair of conflicting instructions and, in S, the instruction of T1 in that pair is
executed before the corresponding instruction of T2. A schedule is conflict serializable
if its precedence graph has no cycles. A cycle means that every transaction on it has a
conflicting instruction that must be executed before a conflicting instruction of the next
transaction on the cycle, so any serial order of these transactions violates at least one
required ordering. Our precedence graph has no cycles, so the corresponding schedule is
conflict serializable. To find a serial schedule that is conflict equivalent to the
schedule represented by the precedence graph, we must determine an order of transactions
that complies with the graph. We cannot take T5 as the first transaction, since
conflicting instructions of T4 and T3 must be executed before it. Examining the graph, we
conclude that only T1 can be chosen first, since it has no predecessors. Similarly, the
second can only be T2, whose only predecessor is T1. Next either T3 or T4 may be chosen,
since each of them must be executed after both T1 and T2. So the possible serial schedules
are T1, T2, T3, T4, T5 and T1, T2, T4, T3, T5; each of them is a topological order of the
nodes of the precedence graph.
For machine processing, the graph may be represented by an adjacency matrix:
      1   2   3   4   5
1         1   1   1
2             1   1
3                     1
4                     1
5
The matrix has n rows and n columns, where n is the number of nodes. Element (i, j) is 1 if
there is an edge directed from node Ti to node Tj and 0 otherwise; zero elements are not
shown (empty cells). The number of 1s equals the number of edges. To find the successors of
a node we examine its row: for example, the successors of T1 are T2, T3 and T4, because the
1s in row 1 are in columns 2, 3 and 4. Similarly, the predecessors of a node are found from
its column: for example, T3 has two predecessors, T1 and T2, because column 3 has 1s in
rows 1 and 2. To find a node without predecessors we look for a column of all zeroes;
initially this is only column 1, hence the first transaction in the serial schedule is T1.
Then we zero row 1 and look for new zero columns; only column 2 becomes zero, hence T2 is
second in the schedule. We zero row 2, and columns 3 and 4 become zero, hence either T3 or
T4 may be chosen next, for example T4. We zero its row; no new zero columns appear, so T3
is next. After zeroing row 3, column 5 becomes zero, and T5 is the last transaction in the
schedule. This procedure also gives a way to check for cycles: if it ends with the whole
matrix zeroed, there are no cycles; otherwise the remaining 1s belong to nodes that lie on
cycles or can be reached only through them. The complexity of the procedure is O(n^2),
provided the number of 1s in each column is kept as a counter rather than recounted at
every step.
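The row-zeroing procedure described above can be sketched as follows. The function name
serial_order and the 0/1 list-of-lists encoding of the matrix are assumptions of this
example; column counters are kept so that the whole run takes O(n^2) steps. When several
nodes are free, the sketch picks the one with the smallest number, so it returns only one
of the possible serial orders.

```python
def serial_order(matrix):
    """Return one topological order of nodes 1..n, or None if the graph has a cycle."""
    n = len(matrix)
    m = [row[:] for row in matrix]  # work on a copy, the input is left intact
    # indegree[j] is the number of 1s in column j, i.e. the predecessors of node j.
    indegree = [sum(m[i][j] for i in range(n)) for j in range(n)]
    remaining = set(range(n))
    order = []
    while remaining:
        # A node with an all-zero column has no unprocessed predecessors.
        node = min((j for j in remaining if indegree[j] == 0), default=None)
        if node is None:
            return None  # 1s are left but no zero column: a cycle exists
        for j in range(n):  # zero the row of the chosen node
            if m[node][j]:
                m[node][j] = 0
                indegree[j] -= 1
        remaining.remove(node)
        order.append(node + 1)
    return order

# Matrix of exercise 15.10, with the empty cells written as 0.
GRAPH = [
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
order = serial_order(GRAPH)  # [1, 2, 3, 4, 5]
```

For a matrix with a cycle, such as [[0, 1], [1, 0]], no zero column is ever found and the
function returns None.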
15.11. What is a recoverable schedule? Why is recoverability of schedules desirable? Are
there any circumstances under which it would be desirable to allow non-recoverable
schedules?
Recoverability is the ability to restore the previous consistent state of the database
when a transaction fails, so it is an important property. In a concurrent environment,
where multiple transactions execute simultaneously, a non-recoverable situation arises if
a transaction T1 that has used results produced by another transaction T2 commits, and T2
then continues executing and fails. The results of T2 have already been used by T1, and
since T1 has committed, it cannot be rolled back. To ensure recoverability, a transaction
that uses results produced by other transactions must not commit before all of those
transactions have committed. This may lead to long response times for transactions that
use results produced by long transactions. In such circumstances, if the probability of
failure of the transactions that supply data to others is small, non-recoverable
schedules may be allowed.
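The commit-order rule for recoverability can be checked mechanically. The following is a
minimal sketch under simplifying assumptions: a schedule is a list of (transaction, action,
item) triples with actions "read", "write" and "commit" only (item is None for a commit),
and a transaction without a commit in the list is treated as still running. The function
name is hypothetical.

```python
def is_recoverable(schedule):
    """True if no transaction commits before a transaction it has read from."""
    commit_pos = {
        txn: pos for pos, (txn, action, _) in enumerate(schedule) if action == "commit"
    }
    last_writer = {}  # item -> transaction that wrote it most recently
    for txn, action, item in schedule:
        if action == "write":
            last_writer[item] = txn
        elif action == "read":
            writer = last_writer.get(item)
            if writer is None or writer == txn or txn not in commit_pos:
                continue
            # The reader commits; the writer must have committed earlier.
            if commit_pos.get(writer, float("inf")) > commit_pos[txn]:
                return False
    return True

# T1 reads A written by T2 and commits first: the situation described in 15.11.
bad = [("T2", "write", "A"), ("T1", "read", "A"), ("T1", "commit", None), ("T2", "commit", None)]
# Same reads, but T1 delays its commit until T2 has committed.
good = [("T2", "write", "A"), ("T1", "read", "A"), ("T2", "commit", None), ("T1", "commit", None)]
```

With these inputs is_recoverable(bad) is False and is_recoverable(good) is True.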
15.12. What is a cascadeless schedule? Why is cascadelessness of schedules desirable?
Are there any circumstances under which it would be desirable to allow non-cascadeless
schedules?
A cascadeless schedule is one in which the failure of a transaction whose results are used
by other transactions does not force those other transactions to be rolled back. This is
achieved by allowing a value written by a transaction to be read only after that
transaction has committed. This restricts the concurrency of transaction execution, so if
the probability of transaction failure is small, it may be desirable to allow
non-cascadeless schedules.
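The rule "read only committed values" is easy to test on a given schedule. The sketch
below makes the same assumptions as before: a schedule is a list of (transaction, action,
item) triples with actions "read", "write" and "commit" (item is None for a commit), and
the function name is hypothetical.

```python
def is_cascadeless(schedule):
    """True if every read of another transaction's write happens after its commit."""
    committed = set()
    last_writer = {}  # item -> transaction that wrote it most recently
    for txn, action, item in schedule:
        if action == "commit":
            committed.add(txn)
        elif action == "write":
            last_writer[item] = txn
        elif action == "read":
            writer = last_writer.get(item)
            if writer is not None and writer != txn and writer not in committed:
                return False  # dirty read: a failure of `writer` would cascade to `txn`
    return True

# T1 reads A while T2 is still uncommitted: recoverable, but not cascadeless.
dirty = [("T2", "write", "A"), ("T1", "read", "A"), ("T2", "commit", None), ("T1", "commit", None)]
# T1 waits for the commit of T2 before reading.
clean = [("T2", "write", "A"), ("T2", "commit", None), ("T1", "read", "A"), ("T1", "commit", None)]
```

The first schedule allows more overlap between T1 and T2, which is the concurrency that a
cascadeless schedule gives up.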