Uniprocessor Checkpointing CS 717 – Fall 2001 9/25/01

The Need to Save State



Many of the FT systems we have discussed
need a way to restart processes from
previous points in their computation
A checkpoint is just a ‘snapshot’ of a process
(or system) at a certain point in time
A checkpointing system provides a way to
take these snapshots, and to restart from
them
Types of Ckpt Systems

Kernel Level



OS supports ckpt & recovery
Transparent to the application and developer
User Level

Application linked against (user) library




Library functions perform ckpt and recovery
Transparent to application
Limitations (cannot restore PID, PPID, etc.)
Application Level

Applications coded to ckpt themselves, and to
restart from a checkpoint
Comparison of Levels

Kernel & User (System) Level




Easy to add checkpointing to existing code
Works with (almost) any programs
General, ‘coarse’, approach
Application Level


Could require complete re-write, or
extensive modifications
Specific, ‘fine-grained’ solutions
System Level Checkpointing

Libckpt (1994)

Plank, Beck, Kingsley (UTK), Li (Princeton)

User level library for UNIX
Libckpt


User Level Checkpoint Library
Goals

Transparent


Requires minimal modifications to code and re-linking
Low Overhead


Automatic optimizations to reduce ckpt file size
Allow user directed checkpointing
Libckpt Overview

Taking the ‘snapshot’



Suspend the process
Write process’ memory and registers to a
file
Recovery


Reload executable from original file
Reconstruct memory and register state
from checkpoint file
Libckpt Operation


Application main() is re-named
ckpt_target()
Library main() checks if in restore
mode (specified using command line
option); otherwise reads checkpoint
parameters from file
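The dispatch described on this slide can be sketched as a small argument check. This is only an illustration, not libckpt's source: the helper name `is_recover_mode` is hypothetical, and the only detail taken from the slides is the `=recover` command line option.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if any argument is exactly "=recover"
 * (the restore-mode option named on the slides), 0 otherwise. */
int is_recover_mode(int argc, char **argv)
{
    assert(argc >= 0 && argv != NULL);
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "=recover") == 0)
            return 1;
    }
    return 0;
}
```

A library main() built this way would call its restore routine when the helper returns 1, and otherwise read the checkpoint parameters and call ckpt_target().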
Libckpt Operation (2)


main() sets a timer to interrupt
application every n seconds
On signal



Uses setjmp to record registers, pc, etc.
Writes the stack and heap segments to file
Resumes application code
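The periodic interrupt above could be set up roughly as follows on a POSIX system. This is a sketch under assumptions: the names `start_ckpt_timer` and `ckpt_was_requested` are hypothetical, and the handler only sets a flag where a real library would save registers and write the stack and heap.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>

/* Set by the handler; a real library would take the checkpoint here. */
static volatile sig_atomic_t ckpt_requested = 0;

static void on_alarm(int signo)
{
    (void)signo;
    ckpt_requested = 1;
}

/* Hypothetical helper: deliver SIGALRM every `seconds` seconds.
 * Returns 0 on success, -1 on failure. */
int start_ckpt_timer(long seconds)
{
    struct sigaction sa;
    struct itimerval tv;

    assert(seconds > 0);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;       /* restart interrupted system calls */
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    tv.it_interval.tv_sec = seconds; /* period of the repeating timer */
    tv.it_interval.tv_usec = 0;
    tv.it_value = tv.it_interval;    /* first expiry after one period */
    return setitimer(ITIMER_REAL, &tv, NULL);
}

/* Returns 1 once the handler has run at least once. */
int ckpt_was_requested(void)
{
    return ckpt_requested != 0;
}
```

Because the handler runs asynchronously with respect to the application, it should restrict itself to async-signal-safe operations; setting a `sig_atomic_t` flag is the safest such operation.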
Libckpt Operation (3)






If application started with =recover as
command line option
Application begins, recovering Text
segments
Open checkpoint file
Recover heap from file
Recover stack from file
Restores register file (using longjmp)
Virtual Address Space (highest address first)
Bottom of Stack
Stack
SP
sbrk(0)
Heap
&edata
Data (Static)
&etext
Text
0
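The boundaries in this diagram can be read from inside a running process. The sketch below assumes a Linux/ELF toolchain, where the linker defines the symbols `etext` and `edata` (documented in the end(3) manual page) and `sbrk(0)` returns the current program break. The function names are hypothetical, and on modern systems the region above `&edata` also contains uninitialized data (.bss), while large allocations may live outside the break-managed heap.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Linker-provided symbols on typical ELF/Unix toolchains. */
extern char etext, edata;

/* Bytes between &edata and the current program break: the region the
 * diagram labels "Heap" (on Linux it also spans .bss). */
size_t heap_region_bytes(void)
{
    uintptr_t lo = (uintptr_t)&edata;
    uintptr_t hi = (uintptr_t)sbrk(0);
    assert(hi >= lo);
    return (size_t)(hi - lo);
}

/* Bytes between &etext and &edata: the static data region. */
size_t data_region_bytes(void)
{
    uintptr_t lo = (uintptr_t)&etext;
    uintptr_t hi = (uintptr_t)&edata;
    assert(hi >= lo);
    return (size_t)(hi - lo);
}
```

A user-level checkpointer can use these two addresses plus the stack pointer to decide which byte ranges to write out, without any help from the kernel.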
Checkpoint And Recovery
Algorithms
main()
    if (recovery)
        restore stack
        restore heap
        pos = top of stack
        longjmp(pos, 1)    // restore regs.
    else
        run usual code

signal_handler()
    jmp_buf pos
    if (setjmp(pos) == 0)
        // saved regs. in known position on stack
        write stack
        write heap
    else
        // process recovered
        return
Illustration
Checkpoint:
main()
    user_main()
        fun1()
            fun2()
                signal
                    save regs on stack
                    save stack to file
                    save heap to file
                    resume
Recovery:
main()
    restore()
        restore stack
        restore heap
        take jump
Optimization: Incremental
Checkpointing


Observation: between taking two
checkpoints, only a portion of the
memory has actually been changed
Optimization: save only what has been
changed since last ckpt, the rest can be
read from previous ckpts
Taking Incremental Ckpts.




After taking a ckpt (and after init.), set
protection on all pages to ‘read-only’
Write to page will cause a protection violation
Libckpt library catches that signal, and sets
page protection to ‘read-write’, page is
marked as dirty
When writing checkpoint file, only write dirty
pages
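The page-protection steps above can be sketched with mprotect and a fault handler. This is a minimal illustration over a private four-page region, not libckpt's implementation. All names (`tracker_init`, `tracker_reset`, and so on) are hypothetical, and the sketch assumes a Linux-like system where mprotect may be called from a signal handler, which POSIX does not guarantee.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define NPAGES 4                       /* size of the demo region, in pages */

static char *region;                   /* memory being tracked */
static size_t page_size;
static volatile sig_atomic_t dirty[NPAGES];

/* Fault handler: mark the faulting page dirty and make it writable. */
static void on_fault(int signo, siginfo_t *info, void *ctx)
{
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)region;
    size_t idx;

    (void)signo; (void)ctx;
    if (addr < base || addr >= base + NPAGES * page_size)
        _exit(1);                      /* a real crash, not a tracked write */
    idx = (addr - base) / page_size;
    dirty[idx] = 1;
    if (mprotect(region + idx * page_size, page_size,
                 PROT_READ | PROT_WRITE) != 0)
        _exit(1);
}

/* Map the region and install the handler. Returns 0 on success. */
int tracker_init(void)
{
    struct sigaction sa;

    page_size = (size_t)sysconf(_SC_PAGESIZE);
    region = mmap(NULL, NPAGES * page_size, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
        return -1;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    return sigaction(SIGBUS, &sa, NULL);   /* some systems raise SIGBUS */
}

/* Call after each checkpoint: clear dirty bits, write-protect all pages. */
int tracker_reset(void)
{
    for (int i = 0; i < NPAGES; i++)
        dirty[i] = 0;
    return mprotect(region, NPAGES * page_size, PROT_READ);
}

/* Volatile accesses keep the store ordered with the dirty-bit reads. */
void tracker_write(int page, char value)
{
    assert(page >= 0 && page < NPAGES);
    *(volatile char *)(region + (size_t)page * page_size) = value;
}

char tracker_read(int page)
{
    assert(page >= 0 && page < NPAGES);
    return *(volatile char *)(region + (size_t)page * page_size);
}

int tracker_is_dirty(int page)
{
    assert(page >= 0 && page < NPAGES);
    return dirty[page] != 0;
}
```

After `tracker_reset()`, the first write to a page costs one fault; later writes to the same page run at full speed until the next checkpoint. A checkpoint writer would then save only the pages whose dirty bit is set.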
Drawbacks to Incremental Ckpt


Required to keep multiple copies of the
checkpoint file
On recovery, will unnecessarily restore
old copies of data
Optimization: Asynchronous
Checkpointing


Observation: the process must be
suspended while the checkpoint file is
written
Optimization: a separate thread could
write the checkpoint file while the main
thread was allowed to continue
Asynchronous Checkpointing

Make a copy of the process space

2nd thread writes copy to disk

1st thread continues without halting
Asynchronous Checkpointing(2)





Unix fork() provides the necessary
behavior
When about to take ckpt, process forks
OS makes a complete copy of the
original process’ space
Clone writes ckpt file, then dies
Original continues computing
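The fork-based scheme above can be sketched as follows. This is an illustration only: the function names and the idea of passing a single buffer are assumptions made to keep the example short, whereas a real checkpointer would write the stack and heap segments.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that writes `len` bytes at `buf` to `path`, then exits.
 * Returns the child's pid (or -1 if fork failed). The parent returns
 * immediately and keeps computing. */
pid_t async_checkpoint(const char *path, const void *buf, size_t len)
{
    const char *p = buf;
    pid_t pid;
    int fd;

    assert(path != NULL && buf != NULL);
    pid = fork();
    if (pid != 0)
        return pid;                  /* parent, or fork failure (-1) */

    /* Child: it sees memory exactly as it was at the moment of fork(). */
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        _exit(1);
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0)
            _exit(1);
        p += n;
        len -= (size_t)n;
    }
    close(fd);
    _exit(0);                        /* the clone dies after writing */
}

/* Reap the checkpointing child; returns 0 if it wrote the file. */
int async_checkpoint_wait(pid_t pid)
{
    int status;

    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Anything the parent writes to `buf` after `async_checkpoint` returns does not appear in the file, because the child works on its own copy of the address space.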
Copy-On-Write Checkpointing


Like asynchronous checkpointing, but
only copy page if the two versions are
about to differ
Some (most?) operating systems implement fork()
in this manner, so the benefit is automatic
Checkpoint Compression




Use a standard data compression algorithm
to shrink the size of the checkpoint file
Only improves overhead if the speed of
compression is faster than the speed of disk
writes, and compression is significant
“For uniprocessor checkpointing, this is not
the case”
Not implemented in libckpt
User Directed Checkpointing




As described so far, libckpt is (almost)
entirely transparent to the programmer
Compare to application level checkpoint
requiring extensive code changes
Is there a middle ground?
Libckpt allows programmers to annotate
application code with directives that
guide the checkpointing
Memory Exclusion

Certain areas of memory can be excluded
from the checkpoint




Dead memory – will never be read or written
Clean memory – values have not changed since
previous checkpoint
Incremental Ckpt provides clean memory opt.
at a coarse level (page size)
Only writing the ‘active’ areas of the stack
and heap provides dead memory opt.
User Directed Memory
Exclusion

Libckpt provides the app. programmer
with two functions

exclude_bytes(ptr, length, usage)


Specify an area of memory to exclude from
future checkpoints
include_bytes(ptr, length)

Add a previously excluded area of memory to
future checkpoints
Clean Memory

If mem is clean




exclude_bytes(mem, …, CKPT_READONLY)
Include mem in next checkpoint, but
exclude in all subsequent
Cannot write to mem until after call to
include_bytes(mem)
Restore last saved version of mem
Clean Memory: Example
for (…)
{
    A = init_A();
    exclude_bytes(A, …, CKPT_READONLY);
    do_stuff(A);    // assumes A does not change
    include_bytes(A, …);
}
Dead Memory

If mem is dead




exclude_bytes(mem, …, CKPT_DEAD)
Do not checkpoint mem
Cannot read mem until after
include_bytes(mem)
Will not restore mem
Dead Memory: Example
for (…)
{
    A = init_A();
    do_stuff(A);
    exclude_bytes(A, …, CKPT_DEAD);
    do_other_stuff();    // assumes it will not read A
    include_bytes(A, …);
}
Using Memory Exclusion


There can be a dramatic reduction in
the size of the checkpoint file
Must be used very carefully

Inadvertently excluding a live region from a
checkpoint could cause erroneous behavior
on restart
Synchronous Checkpointing

At different points in the program’s
execution the amount of ‘live’ state
varies widely



The stack might be much smaller
(shallower call graph)
Heap items might have been de-allocated
Regions of memory might be dead or clean
Synchronous Ckpt (2)



If checkpoints are taken at times where
there is relatively little live state, the
checkpoint file size (and overhead) will
be smaller
Allow user to specify where in a
program a checkpoint should be taken
Independent of timers (signals)
Sync. Ckpt. Example
for (…)
{
    checkpoint_here();
    A = malloc(…);
    do_stuff(A);
    free(A);
}
Synchronous Ckpt (3)


To avoid checkpointing too frequently,
mintime parameter specifies the
minimal amount of time between two
checkpoints
If checkpoint_here() is called less than
mintime seconds after the last
checkpoint, the call is ignored
Synchronous Ckpt (4)


To ensure that checkpoints are taken
frequently enough to be of use,
maxtime parameter specifies the
maximum time allowed to elapse
between two checkpoints
If maxtime passes, an asynchronous
checkpoint is taken
Combining Mem. Exclusion
and Sync. Checkpointing
Original:
main(){
    D = malloc
    f = file
    while(!done){
        D = read(f)
        perform_calc(D)
        output_result()
    }
}
With memory exclusion and synchronous checkpointing:
ckpt_target(){
    D = malloc
    f = file
    while(!done){
        D = read(f)
        perform_calc(D)
        output_result()
        exclude_bytes(D, …, CKPT_DEAD)
        checkpoint_here()
        include_bytes(D, …)
    }
}