Lecture 12
Questions?
Monday, February 7 CS 470 Operating Systems - Lecture 12

Outline
- Atomic transactions
- Log-based recovery
- Serializability
- Concurrency control
- Locking
- Timestamping

Atomic Transactions
Implementing critical sections (CS) ensures mutual exclusion (ME), so that when two or more processes execute concurrently, the result appears as some sequential ordering of their CSs.
For many applications this is sufficient. E.g., the Producer/Consumer solution only protects an increment/decrement of one shared integer variable.
What happens if an operation in the CS fails?

Atomic Transactions
Sometimes a stronger guarantee is needed: either all of the operations in a CS succeed, or none of them do.
For example, an ATM transfer of $100 from savings to checking consists of two steps:
1. Debit $100 from savings
2. Credit $100 to checking
We suppose that the ATM contacts a server to complete the transaction.

Atomic Transactions
- Suppose the ATM makes a request to the server to do Step 1, gets an acknowledgement of success, and then crashes?
- What if the server crashes after receiving the ATM request for Step 1, but comes back up before the ATM makes the Step 2 request?
- Suppose that the two accounts are on separate servers. What happens if Step 1 is successful, but the other server crashes during Step 2?

System Model
The major concern is failure within a system. Note that this can happen even if there is only one process running. We will look at that case first, then look at concurrent tasks.
A collection of operations that form one logical operation is called a transaction.
The operations in a transaction are either reads (access only) or writes (update), and the transaction ends in either a commit (all operations have succeeded and cannot be undone) or an abort (all partial changes are rolled back).

System Model
First look at the various storage types and their relevance to failure:
- Volatile storage: L1/L2 cache, RAM. Very fast, but contents are almost always lost when the system crashes.
- Non-volatile storage: NVRAM, but also disks, tape, etc. Usually survives system crashes, but is subject to media failures. Slower than volatile storage.
- Stable storage: extremely high probability that it never loses data. Approximated by replicating information across several non-volatile media (e.g., disks) with independent failure modes, updated in a controlled manner.

Log-Based Recovery
Information must be kept on stable storage so that it survives system failures.
First look at ensuring atomic transactions with only one transaction running, when only volatile storage is lost.
Log-based recovery is a common method. In addition to the data, a log is kept on stable storage. Each log record contains information about a transaction and is written before the actual action takes place. This is often called a write-ahead log.

Log-Based Recovery
Possible log records for a transaction Ti are:
- <Ti, start> - written when Ti starts
- <Ti, commit> - written when Ti is completely finished
- <Ti, abort> - sometimes written when Ti is unable to finish
- <Ti, itemName, oldValue, newValue> - written when Ti modifies itemName

Log-Based Recovery
Since each log record is written before the actual write operation takes place, the log can be used to reconstruct the state of a data item.
Note: the price for this ability is that the system is inherently slower.
Two physical writes (the log and the item) are needed for every logical write. Also, the system needs more space.

Log-Based Recovery
When the system comes back up after a crash, it uses the log to recover. The algorithm has two operations:
- Undo(Ti): restore the values of all items updated by Ti to their old values
- Redo(Ti): set the values of all items updated by Ti to their new values
These operations must be idempotent, meaning that multiple executions have the same result as one, in case there is a failure during recovery.

Log-Based Recovery
Example log:
<T0, start>
<T0, D0, V0old, V0new>
<T0, D1, V1old, V1new>
<T0, commit>
<T1, start>
<T1, D0, V0old, V0new>
CRASH!
On abort, the system calls Undo(Ti).
After a crash, which transactions are undone and which are redone?
- If the log contains <Ti, start> but no <Ti, commit>, Ti must be undone, so call Undo(Ti)
- If the log contains both <Ti, start> and <Ti, commit>, Ti must be redone, so call Redo(Ti)

Checkpoints
In principle, when the system crashes, the entire log is needed to recover. This is slow, and most of the writes are already reflected in stable storage.
Periodically perform a checkpoint:
1. Make sure all current log records in volatile storage have been flushed to stable storage
2. Flush all volatile data to stable storage
3. Write a <checkpoint> log record to stable storage

Checkpoints
If <Ti, commit> appears before a checkpoint record, Ti does not need to be redone.
Refine the recovery algorithm to:
- Find Ti, the most recent transaction that started before the most recent checkpoint record
- Apply Undo and Redo as before to Ti and to the transactions that start after it in the log
The entries before Ti can also be discarded, i.e., the log can be pruned.

Serializability
Now consider what happens if more than one atomic transaction occurs concurrently.
We could execute each one in a CS, but this is too restrictive. Often the operations can be "overlapped" and still maintain a correct result.
Correctness is defined by serializability: concurrent transactions must appear as if they had executed in some serial order.

Serializability
A serial schedule is one in which each transaction executes in its entirety before another one executes. For example, for two transactions T0 and T1:

T0          T1
read(A)
write(A)
read(B)
write(B)
            read(A)
            write(A)
            read(B)
            write(B)

Serializability
For n concurrent transactions, there are n! possible serial schedules. Why?
If we allow transactions to "overlap", i.e., their operations are interleaved, we get a non-serial schedule, but some such schedules give the same result as one of the serial schedules.

Serializability
Operations Oi and Oj are said to be conflicting operations if both access the same data and at least one is a write operation.
In the schedule below, T1:read(A) conflicts with T0:write(A), but T0:read(B) does not conflict with T1:write(A).

T0          T1
read(A)
write(A)
            read(A)
            write(A)
read(B)
write(B)
            read(B)
            write(B)

Serializability
If operations Oi and Oj are consecutive operations of different transactions in a schedule and do not conflict, we can swap their order to produce a new schedule.
E.g., in the previous example, we can swap T1:write(A) with T0:read(B).
To prove a non-serial schedule is correct (i.e., ensures serializability), show that it can be turned into a serial schedule by repeatedly swapping non-conflicting consecutive operations.

Concurrency Control
Introduce the concept of concurrency control: rules of access that allow concurrent execution when possible, but ensure serializability.
Two major protocols:
- Locking - a transaction must lock an object before access
- Timestamping - access to objects must be consistent with a predetermined serial order

Locking
Sort of like the Readers/Writers problem.
Each object has two types of lock: shared (S) and exclusive (E).
- A transaction must request a lock before access
- If the object is not locked, the lock is granted
- If the object is locked S and the request is S, the lock is granted
- If the object is locked E, or locked S and the request is E, the transaction must wait for release of the lock

Locking
To ensure serializability, use two phases:
- a grow phase, in which locks are acquired
- a shrink phase, in which locks are released
To do otherwise can lead to non-serializable schedules, i.e., another transaction can see an intermediate result.
The protocol can lead to deadlock, but is a relatively efficient algorithm.

Timestamping
The locking protocol determines the correct serial order with respect to an object at execution time, when the first lock is requested.
The timestamping mechanism chooses an order in advance by assigning a unique timestamp TS(i) to each transaction Ti as it enters the system.
The timestamps have the property that if Ti entered before Tj, then TS(i) < TS(j).

Timestamping
The system ensures access consistent with the implied serial schedule by associating two timestamp values with each object O:
- W-ts(O): the largest timestamp of any transaction that has successfully executed write(O)
- R-ts(O): the largest timestamp of any transaction that has successfully executed read(O)
These timestamps are updated whenever a new read or write occurs.
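As a minimal sketch of this bookkeeping, the Python below hands out increasing timestamps and keeps the two values for one object. The names TimestampSource, ObjectStamps, note_read, and note_write are made up for illustration and do not come from the lecture. The sketch also assumes a single thread; a real system would need to protect the counter.

```python
import itertools
from dataclasses import dataclass


class TimestampSource:
    """Hands out unique, strictly increasing timestamps (hypothetical helper)."""

    def __init__(self):
        self._counter = itertools.count(1)

    def next_ts(self):
        # A transaction that enters later always gets a larger value,
        # so TS(i) < TS(j) whenever Ti entered before Tj.
        return next(self._counter)


@dataclass
class ObjectStamps:
    """R-ts(O) and W-ts(O) for one object O; 0 means 'never accessed'."""
    r_ts: int = 0
    w_ts: int = 0

    def note_read(self, ts):
        # R-ts(O) is the largest timestamp of any successful read.
        self.r_ts = max(self.r_ts, ts)

    def note_write(self, ts):
        # W-ts(O) is the largest timestamp of any successful write.
        self.w_ts = max(self.w_ts, ts)


source = TimestampSource()
ts_i, ts_j = source.next_ts(), source.next_ts()   # Ti enters before Tj

stamps = ObjectStamps()
stamps.note_read(ts_j)
stamps.note_read(ts_i)    # an older reader does not lower R-ts(O)
stamps.note_write(ts_i)
```

Using max in both updates is a choice made for this sketch so that the stored values match the "largest timestamp" definitions no matter which order the calls arrive in.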
Timestamping
The protocol to ensure that any conflicting reads and writes are executed in timestamp order is:
When Ti issues read(O):
- If TS(Ti) < W-ts(O), then Ti needs to read a value of O that has already been overwritten. The operation is rejected and Ti is rolled back.
- If TS(Ti) >= W-ts(O), then the operation is executed and R-ts(O) is set to max(R-ts(O), TS(Ti)).

Timestamping
When Ti issues write(O):
- If TS(Ti) < R-ts(O), then Ti is producing a value that should have been read by an earlier read. The operation is rejected and Ti is rolled back.
- If TS(Ti) < W-ts(O), then Ti is producing a value that is obsolete. The operation is rejected and Ti is rolled back.
- Otherwise, the write is executed and W-ts(O) is set to TS(Ti).
A transaction that is rolled back is assigned a new timestamp and restarted.

Timestamping
The timestamping protocol is sometimes called optimistic, because it tends to allow more possible correct non-serial schedules than locking, which is sometimes called pessimistic. But both can produce schedules that the other cannot.
Timestamping also cannot produce a deadlock, since no transaction ever waits. However, there still could be starvation.
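The read and write rules above can be sketched in Python as follows. This is only an illustration under stated assumptions: the class names TimestampedObject and Rollback are hypothetical, timestamps are plain integers with 0 meaning "never accessed", and the caller is responsible for restarting a rolled-back transaction with a new timestamp.

```python
class Rollback(Exception):
    """Raised when an operation is rejected and the transaction must restart."""


class TimestampedObject:
    """One data object O with its R-ts(O) and W-ts(O) (hypothetical class)."""

    def __init__(self, value=None):
        self.value = value
        self.r_ts = 0   # R-ts(O)
        self.w_ts = 0   # W-ts(O)

    def read(self, ts):
        # TS(Ti) < W-ts(O): the value Ti needs has already been overwritten.
        if ts < self.w_ts:
            raise Rollback("read rejected: value already overwritten")
        self.r_ts = max(self.r_ts, ts)
        return self.value

    def write(self, ts, value):
        # TS(Ti) < R-ts(O): a later transaction has already read O.
        if ts < self.r_ts:
            raise Rollback("write rejected: needed by an earlier read")
        # TS(Ti) < W-ts(O): the value being written is obsolete.
        if ts < self.w_ts:
            raise Rollback("write rejected: obsolete value")
        self.value = value
        self.w_ts = ts


obj = TimestampedObject(value=100)
obj.write(2, 150)            # T2 writes: W-ts(O) becomes 2
seen = obj.read(3)           # T3 reads:  R-ts(O) becomes 3

try:
    obj.write(1, 999)        # T1 is older than the reader T3
    outcome = "executed"
except Rollback:
    outcome = "rolled back"  # T1 would be given a new timestamp and restarted
```

No call ever blocks: an operation either executes or raises Rollback, which is why this scheme cannot deadlock. A transaction that keeps being rolled back can still starve.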