Lecture 11 Questions? Friday, February 3 CS 470 Operating Systems - Lecture 11

Outline

Atomic transactions

Log-based recovery

Serializability

Concurrency control

Locking

Timestamping
Atomic Transactions



Implementing critical sections (CS) ensures
mutual exclusion (ME) so that when two or
more processes are executed concurrently, the
result will appear as some relative sequential
ordering of each CS.
For many applications this is sufficient. E.g.,
the Producer/Consumer solution is only
protecting an increment/decrement of one
shared integer variable.
What happens if an operation in the CS fails?
Atomic Transactions


Sometimes a stronger guarantee is needed:
either all of the operations in a CS succeed, or
none of the operations in a CS succeed.
For example, an ATM transfer of $100 from
savings to checking consists of two steps:
1. Debit $100 from savings
2. Credit $100 to checking

We suppose that an ATM contacts a server to
complete the transaction.
Atomic Transactions



What if the ATM makes a request to the
server to do Step 1, gets an acknowledgement
of success, and then crashes?
What if the server crashes after receiving the ATM
request for Step 1, but comes back up before
the ATM makes the Step 2 request?
Suppose that the two accounts are on separate
servers. What happens if Step 1 is successful,
but the other server crashes during Step 2?
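The first failure scenario can be sketched in a few lines. This is only an illustration: the account balances, the `Crash` exception, and the `crash_after_debit` flag are hypothetical and stand in for a real failure between the two steps.

```python
class Crash(Exception):
    """Hypothetical stand-in for a failure between Step 1 and Step 2."""

def transfer(accounts, amount, crash_after_debit=False):
    accounts["savings"] -= amount          # Step 1: debit savings
    if crash_after_debit:
        raise Crash("failure between Step 1 and Step 2")
    accounts["checking"] += amount         # Step 2: credit checking

accounts = {"savings": 500, "checking": 200}   # made-up balances
try:
    transfer(accounts, 100, crash_after_debit=True)
except Crash:
    pass

total = sum(accounts.values())   # 600 instead of 700: the $100 is lost
```

Mutual exclusion alone does not help here: no other process interfered, yet the data is left in an inconsistent state.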
System Model


The major concern is failure within a system.
Note that this happens even if there is only one
process running. We will look at that case first,
then look at concurrent tasks.
A collection of operations that form one logical
operation is called a transaction. The operations
within a transaction are either reads (access
only) or writes (update), and the transaction ends
in either a commit (all physical operations succeed and
cannot be undone) or an abort (all partial
changes are rolled back).
System Model

First look at various storage types and their
relevance to failure:



Volatile storage: L1/L2 cache, RAM. Very fast, but
almost always lost when system crashes.
Non-volatile storage: NVRAM, but also disks, tape,
etc. Usually survives system crashes, but is subject to
media failures. Slower than volatile storage.
Stable storage: extremely high probability that it
never loses data. Approximated by replicating
information across several non-volatile media (e.g.,
disks) with independent failure modes and updated
in a controlled manner.
Log-Based Recovery


Obviously, need to keep all information on
stable storage, so that it survives system
failures. First look at ensuring atomic
transactions with only one transaction running
when only volatile storage is lost.
Log-based recovery is a common method. In
addition to the data, a log is kept on stable
storage. Each log record contains information
about a transaction and is written before the
actual action takes place. Often called a write-ahead log.
Log-Based Recovery

Possible log records for a transaction Ti are:

<Ti, start> - written when Ti starts
<Ti, commit> - written when Ti is completely finished
<Ti, abort> - sometimes written when Ti is unable to finish
<Ti, itemName, oldValue, newValue> - written when Ti modifies itemName
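One possible way to represent these four record types in code is sketched below. The class names, field names, and the sample values are assumptions made for illustration; the slides only fix the information each record carries.

```python
from dataclasses import dataclass
from typing import Any

@dataclass(frozen=True)
class Start:            # <Ti, start>
    txn: str

@dataclass(frozen=True)
class Commit:           # <Ti, commit>
    txn: str

@dataclass(frozen=True)
class Abort:            # <Ti, abort>
    txn: str

@dataclass(frozen=True)
class Update:           # <Ti, itemName, oldValue, newValue>
    txn: str
    item: str
    old: Any
    new: Any

# A made-up log for the ATM transfer of $100 from savings to checking.
log = [
    Start("T0"),
    Update("T0", "savings", 500, 400),
    Update("T0", "checking", 200, 300),
    Commit("T0"),
]
```

Keeping both the old and the new value in each update record is what later makes both undo and redo possible.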
Log-Based Recovery


Since the log records are written before any
actual write operation takes place, the log can
be used to reconstruct the state of a data item.
Note: the price for this ability is that the system is
inherently slower: there are two physical writes (the log
and the item) for every logical write. Also, the
system needs more space.
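A minimal sketch of the write-ahead rule, assuming in-memory stand-ins for stable storage (the `log` list, the `data` dictionary, and the `logged_write` helper are hypothetical):

```python
log = []            # stands in for the log on stable storage
data = {"A": 10}    # stands in for the stored data items

def logged_write(txn, item, new_value):
    """Append the update record first, then change the item itself."""
    log.append((txn, item, data[item], new_value))  # physical write 1: the log
    data[item] = new_value                          # physical write 2: the item

log.append(("T0", "start"))
logged_write("T0", "A", 25)
log.append(("T0", "commit"))
```

The two assignments inside `logged_write` are the two physical writes per logical write mentioned above, and the log record always exists before the item changes.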
Log-Based Recovery

When the system recovers from a crash, it uses
the log to recover. The algorithm has two
operations:



Undo (Ti): restore values of all updates by Ti to old
values
Redo (Ti): set values of all updates by Ti to new
values.
These operations must be idempotent,
meaning that multiple executions have the
same result as one, in case there is a failure
during recovery.
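The two operations can be sketched as follows, assuming log records are plain tuples and update records have the form (txn, item, old, new). The function names mirror the slides; everything else is illustrative. Both are idempotent because they assign absolute values taken from the log rather than applying increments.

```python
def undo(log, data, txn):
    """Restore the old values of txn's updates, newest record first."""
    for rec in reversed(log):
        if rec[0] == txn and len(rec) == 4:
            _, item, old, _new = rec
            data[item] = old

def redo(log, data, txn):
    """Set the new values of txn's updates, oldest record first."""
    for rec in log:
        if rec[0] == txn and len(rec) == 4:
            _, item, _old, new = rec
            data[item] = new

log = [("T0", "start"), ("T0", "A", 10, 25), ("T0", "A", 25, 40)]
data = {"A": 25}

redo(log, data, "T0")
redo(log, data, "T0")        # repeating redo changes nothing
after_redo = dict(data)      # {"A": 40}

undo(log, data, "T0")
undo(log, data, "T0")        # repeating undo changes nothing
after_undo = dict(data)      # {"A": 10}
```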
Log-Based Recovery
<T0, start>
<T0, D0, V0old, V0new>
<T0, D1, V1old, V1new>
<T0, commit>
<T1, start>
<T1, D0, V0old, V0new>
CRASH!
On abort, system calls Undo(Ti)
After a crash, which operations are undone and
which are redone?


If the log contains <Ti, start> but no <Ti, commit>, Ti must
be undone, so call Undo(Ti)
If the log contains both <Ti, start> and <Ti, commit>, Ti must be
redone, so call Redo(Ti)
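A sketch of this recovery rule applied to the log above, with made-up numeric values in place of V0old, V0new, etc. It assumes tuple-shaped log records and only one transaction running at a time, as on these slides; the `recover` function is hypothetical.

```python
def recover(log, data):
    """Undo transactions with no commit record, then redo committed ones."""
    committed = {rec[0] for rec in log if rec[1:] == ("commit",)}
    # Undo pass: scan backward, restoring old values of uncommitted updates.
    for rec in reversed(log):
        if len(rec) == 4 and rec[0] not in committed:
            data[rec[1]] = rec[2]
    # Redo pass: scan forward, reapplying new values of committed updates.
    for rec in log:
        if len(rec) == 4 and rec[0] in committed:
            data[rec[1]] = rec[3]
    return data

log = [
    ("T0", "start"),
    ("T0", "D0", 100, 90),
    ("T0", "D1", 50, 60),
    ("T0", "commit"),
    ("T1", "start"),
    ("T1", "D0", 90, 70),
    # CRASH: there is no <T1, commit>
]
data = {"D0": 70, "D1": 50}   # one possible state of the items at the crash
recover(log, data)            # T1 is undone, T0 is redone
```

Here T0 has both records, so its updates are redone; T1 has only a start record, so its update to D0 is undone.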
Checkpoints


In principle, when the system crashes, the entire
log must be scanned to recover. This is slow, and
most of the logged writes are already reflected in
stable storage.
Periodically perform a checkpoint.

Make sure all current log records in volatile storage
have been flushed to stable storage

Flush all volatile data to stable storage

Write <checkpoint> log record to stable storage
Checkpoints

If <Ti, commit> appears before a checkpoint
record, Ti does not need to be redone. Refine
the recovery algorithm to:



Find Ti, the most recent transaction that started
before the most recent checkpoint record
Apply Undo and Redo as before to Ti and those
starting after it in the log
Also can discard the entries before Ti, i.e.
prune the log.
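The refinement can be sketched as a function that returns only the part of the log recovery still needs. The tuple record format, the `CHECKPOINT` marker, and the `recovery_suffix` name are assumptions for illustration.

```python
CHECKPOINT = ("checkpoint",)

def recovery_suffix(log):
    """Return the part of the log that recovery still has to examine."""
    # Position of the most recent checkpoint record, if there is one.
    cp = max((i for i, rec in enumerate(log) if rec == CHECKPOINT), default=None)
    if cp is None:
        return log
    # Most recent <Ti, start> that appears before that checkpoint.
    for i in range(cp - 1, -1, -1):
        if log[i][1:] == ("start",):
            return log[i:]
    return log[cp:]

log = [
    ("T0", "start"), ("T0", "A", 1, 2), ("T0", "commit"),
    ("T1", "start"), ("T1", "B", 5, 6),
    CHECKPOINT,
    ("T1", "B", 6, 7), ("T1", "commit"),
    ("T2", "start"), ("T2", "A", 2, 9),
]
suffix = recovery_suffix(log)   # begins at <T1, start>; T0's records are skipped
```

Everything before `suffix[0]` belongs to transactions that committed before the checkpoint, so those entries are the ones that can be pruned.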
Serializability


Now consider what happens if more than one
atomic transaction occurs concurrently. We
could execute each one in a CS, but this is too
restrictive. Often the operations can be
"overlapped" and still maintain a correct result.
Correctness is defined by serializability:
concurrent transactions must appear as if they
had executed in some serial order.
Serializability

A serial schedule is one in which each transaction
executes in its entirety before another one executes.
For example, for two transactions, T0 and T1:

T0          T1
read(A)
write(A)
read(B)
write(B)
            read(A)
            write(A)
            read(B)
            write(B)
Serializability


For n concurrent transactions, there are n!
possible serial schedules. Why?
If we allow transactions to "overlap", i.e. their
operations are interleaved, we get a non-serial
schedule, but some such schedules give the
same result as one of the serial schedules.
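One way to see the n! count: a serial schedule is fixed entirely by the order in which whole transactions run, so each serial schedule corresponds to one permutation of the transactions. A short check for three hypothetical transactions:

```python
from itertools import permutations
from math import factorial

transactions = ["T0", "T1", "T2"]

# Each permutation is one serial schedule: run the transactions in that order.
serial_orders = list(permutations(transactions))
count = len(serial_orders)   # 6, which is 3!
```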
Serializability


Operations Oi and Oj are said to be conflicting operations
if both access the same data and at least one is a write
operation.
In the example below, T1:read(A) conflicts with
T0:write(A), but T0:read(B) does not conflict with
T1:write(A).

T0          T1
read(A)
write(A)
            read(A)
            write(A)
read(B)
write(B)
            read(B)
            write(B)
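The definition of conflicting operations translates directly into a predicate. The `Op` type and its field names are hypothetical, and the sketch assumes that only operations of different transactions are compared.

```python
from typing import NamedTuple

class Op(NamedTuple):
    txn: str    # transaction name, e.g. "T0"
    kind: str   # "read" or "write"
    item: str   # data item, e.g. "A"

def conflicts(a, b):
    """Same data item, different transactions, and at least one write."""
    return a.txn != b.txn and a.item == b.item and "write" in (a.kind, b.kind)

same_item = conflicts(Op("T1", "read", "A"), Op("T0", "write", "A"))    # True
other_item = conflicts(Op("T0", "read", "B"), Op("T1", "write", "A"))   # False
```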
Serializability


If operations Oi and Oj are consecutive
operations of different transactions in a
schedule and do not conflict, we can swap their
order to produce a new schedule. E.g., in the
previous example, we can swap T1:write(A) with
T0:read(B).
To prove a non-serial schedule is correct (i.e.,
ensures serializability), show that you can swap
non-conflicting consecutive operations back to
a serial schedule.
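The swapping argument can be mechanized for one chosen serial order. The sketch below assumes operations are (txn, kind, item) tuples; `swap_to_serial` is a hypothetical helper that only tests the serial order it is given, so a full check would try each candidate order.

```python
def conflicts(a, b):
    (ta, ka, ia), (tb, kb, ib) = a, b
    return ta != tb and ia == ib and "write" in (ka, kb)

def swap_to_serial(schedule, order):
    """Try to reach the serial schedule `order` by swapping adjacent,
    non-conflicting operations. Returns the serial schedule, or None."""
    rank = {t: i for i, t in enumerate(order)}
    s = list(schedule)
    changed = True
    while changed:
        changed = False
        for i in range(len(s) - 1):
            a, b = s[i], s[i + 1]
            if rank[a[0]] > rank[b[0]]:     # pair is out of serial order
                if conflicts(a, b):
                    return None             # this swap is not allowed
                s[i], s[i + 1] = b, a
                changed = True
    return s

schedule = [
    ("T0", "read", "A"), ("T0", "write", "A"),
    ("T1", "read", "A"), ("T1", "write", "A"),
    ("T0", "read", "B"), ("T0", "write", "B"),
    ("T1", "read", "B"), ("T1", "write", "B"),
]
as_t0_t1 = swap_to_serial(schedule, ["T0", "T1"])   # succeeds
as_t1_t0 = swap_to_serial(schedule, ["T1", "T0"])   # None: T0:write(A) blocks
```

For the interleaved schedule above, four swaps move T0's operations on B ahead of T1's operations on A, giving the serial schedule T0 followed by T1.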
Concurrency Control


Introduce the concept of concurrency control:
rules of access that allow concurrent execution
when possible, but ensure serializability.
Two major protocols:


Locking - transaction must lock an object before
access
Timestamping - access to objects must be
consistent with a predetermined serial order
Locking


Sort of like the Readers/Writers problem. Two
types of locks, one of each for each object:
shared (S) and exclusive (E)
Must request a lock before access



If object is not locked, lock is granted
If object is locked S and request is S, lock is
granted
If object is locked E, or locked S and request is E,
transaction must wait for release of lock
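The three granting rules can be sketched as a small lock table. The `LockTable` class and its method names are hypothetical; instead of blocking, `request` returns False when the transaction would have to wait, and lock upgrades (S to E by the same holder) are not handled.

```python
class LockTable:
    def __init__(self):
        self.held = {}   # object name -> (mode, set of holding transactions)

    def request(self, txn, obj, mode):
        """Return True if the lock is granted, False if txn must wait."""
        if obj not in self.held:                 # not locked: grant
            self.held[obj] = (mode, {txn})
            return True
        held_mode, holders = self.held[obj]
        if held_mode == "S" and mode == "S":     # shared with shared: grant
            holders.add(txn)
            return True
        return False                             # locked E, or S with E request

    def release(self, txn, obj):
        _mode, holders = self.held[obj]
        holders.discard(txn)
        if not holders:
            del self.held[obj]

table = LockTable()
r1 = table.request("T0", "A", "S")   # True: A was unlocked
r2 = table.request("T1", "A", "S")   # True: S is compatible with S
r3 = table.request("T2", "A", "E")   # False: T2 must wait
```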
Locking



To ensure serializability, use two phases:

grow phase in which locks are acquired and none are released

shrink phase in which locks are released and none are acquired
To do otherwise can lead to non-serializable
schedules, i.e., another transaction can see an
intermediate result.
The protocol can lead to deadlock, but is a relatively
efficient algorithm.
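The two-phase rule itself is easy to enforce per transaction. This sketch tracks only the phase discipline, not whether a lock can actually be granted; the `TwoPhaseTransaction` class is hypothetical.

```python
class TwoPhaseTransaction:
    def __init__(self, name):
        self.name = name
        self.locks = set()
        self.shrinking = False   # becomes True at the first release

    def lock(self, obj):
        if self.shrinking:
            raise RuntimeError(
                f"{self.name}: cannot lock {obj} after releasing a lock")
        self.locks.add(obj)      # grow phase

    def unlock(self, obj):
        self.shrinking = True    # shrink phase begins
        self.locks.remove(obj)

t = TwoPhaseTransaction("T0")
t.lock("A")
t.lock("B")
t.unlock("A")
try:
    t.lock("C")                  # violates the two-phase rule
    violation_caught = False
except RuntimeError:
    violation_caught = True
```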
Timestamping



The locking protocol determines the correct
serial order with respect to an object at execution
time, when the first lock is requested.
The timestamping mechanism chooses an
order in advance by assigning a unique
timestamp (TS(i)) to each transaction Ti as it
enters the system.
The timestamps have the property that if Ti
entered before Tj, then TS(i) < TS(j)
Timestamping

The system ensures access consistent with the
implied serial schedule by associating two
timestamp values to each object, O:



W-ts (O): the largest timestamp of any transaction
that has successfully executed write(O)
R-ts (O): the largest timestamp of any transaction
that has successfully executed read(O)
These timestamps are updated whenever a
new read or write occurs.
Timestamping

The protocol to ensure that any conflicting
reads and writes are executed in timestamp
order is:

When Ti issues read(O)


If TS(Ti) < W-ts(O), then Ti needs to read a value of O
that was already overwritten. The operation is rejected and
Ti is rolled back
If TS(Ti) >= W-ts(O), then the operation is executed and
R-ts(O) is set to max(R-ts(O), TS(Ti))
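The read rule can be sketched as follows. The `TimestampedObject` class, the `Rollback` exception, and the use of 0 as the initial timestamp are assumptions for illustration.

```python
class Rollback(Exception):
    """Raised when an operation is rejected and the transaction must restart."""

class TimestampedObject:
    def __init__(self, value):
        self.value = value
        self.r_ts = 0   # R-ts(O)
        self.w_ts = 0   # W-ts(O)

def ts_read(obj, ts):
    """Apply the read rule for a transaction with timestamp ts."""
    if ts < obj.w_ts:
        # The value this transaction should have seen is already overwritten.
        raise Rollback(f"read with TS {ts} rejected, W-ts(O) = {obj.w_ts}")
    obj.r_ts = max(obj.r_ts, ts)
    return obj.value

O = TimestampedObject("v1")
O.w_ts = 5                    # pretend a transaction with TS 5 wrote O

value = ts_read(O, 7)         # allowed: 7 >= 5, so R-ts(O) becomes 7
try:
    ts_read(O, 3)             # rejected: 3 < W-ts(O)
    rejected = False
except Rollback:
    rejected = True
```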
Timestamping

When Ti issues write(O)




If TS(Ti) < R-ts(O), then Ti is producing a value that
should have been read by a previous read. The
operation is rejected and Ti is rolled back.
If TS(Ti) < W-ts(O), then Ti is producing a value that is
obsolete. The operation is rejected and Ti is rolled back.
Otherwise, the write is executed and W-ts(O) is set to
TS(Ti)
A transaction that is rolled back is assigned a
new timestamp and restarted.
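The write rule and the restart with a new timestamp can be sketched together. The dictionary layout, the `ts_write` helper, and the counter used as a timestamp source are all hypothetical.

```python
from itertools import count

next_ts = count(1)   # hypothetical timestamp source: a simple counter

def ts_write(obj, ts, value):
    """Return True if the write is executed, False if it is rejected."""
    if ts < obj["r_ts"]:    # a later transaction already read the old value
        return False
    if ts < obj["w_ts"]:    # a later transaction already wrote a newer value
        return False
    obj["value"], obj["w_ts"] = value, ts
    return True

O = {"value": 0, "r_ts": 0, "w_ts": 0}
ts_a = next(next_ts)                  # 1
ts_b = next(next_ts)                  # 2

ts_write(O, ts_b, "from B")           # executed: W-ts(O) becomes 2
ok = ts_write(O, ts_a, "from A")      # rejected: 1 < W-ts(O)
if not ok:
    ts_a = next(next_ts)              # rolled back, new timestamp 3
    ok = ts_write(O, ts_a, "from A")  # now executed
```

Because a rejected transaction never waits but can be restarted repeatedly, this sketch also hints at why the scheme avoids deadlock yet still permits starvation.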
Timestamping


The timestamping protocol is sometimes called
optimistic, because it tends to allow more
possible correct non-serial schedules than
locking, which is sometimes called
pessimistic. But both can produce schedules
that the other cannot.
Timestamping also cannot produce a deadlock, since no
transaction ever waits. However, there still
could be starvation.