Advanced I/O Techniques for
Efficient and Highly Available
Process Crash Recovery
Protocols
Thesis Presentation
Jason Cornwell
03/15/2011
Agenda
• Introduction
• Challenges
• Pertinent Background
• Proposed Techniques
• Implementations
• Experimental Setup & Results
• Conclusions
• Future Work
Compute-Intensive Applications
Network-Centric Services
Recent Advances
Motivation & Goals
Demand for more computing power and
high-bandwidth network connections
Advances in Microprocessors and Networks
Parallel Computing
• Performance and Scalability
• Reliability and Availability
• Simplicity and Accessibility
Reliability Problems
Large numbers of CPUs, Memory
Modules, Hard Disk Drives, Network
Interfaces, Network Switches
Low Mean-Time-To-Failure (MTTF)
and/or
High Failure-In-Time (FIT)
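The link between component count and failure rate can be sketched with a little arithmetic. FIT counts failures per 10^9 device-hours, so if failures are independent and occur at a constant rate, the FIT values of all parts add up and the system MTTF shrinks accordingly. The component counts and FIT values below are hypothetical and chosen only for illustration; none of them come from the presentation.

```python
# Illustrative only: component counts and FIT values are hypothetical.
HOURS_PER_FIT_UNIT = 1e9  # FIT = failures per 10^9 device-hours


def mttf_hours_from_fit(fit: float) -> float:
    """Convert a FIT rate to MTTF in hours (constant failure rate assumed)."""
    return HOURS_PER_FIT_UNIT / fit


def system_fit(components: dict) -> float:
    """Sum FIT over all parts, assuming independent failures and that
    any single part failing counts as a system failure."""
    return sum(count * fit for count, fit in components.values())


# Hypothetical cluster: name -> (number of parts, FIT per part)
cluster = {
    "cpu": (1000, 100.0),
    "memory_module": (4000, 50.0),
    "hard_disk": (1000, 2000.0),
}

total_fit = system_fit(cluster)
print(f"system FIT:  {total_fit:,.0f}")
print(f"system MTTF: {mttf_hours_from_fit(total_fit):,.1f} hours")
```

With these made-up inputs, a single disk has an MTTF of 500,000 hours, yet the system as a whole has an MTTF of roughly 435 hours.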
Classification of Failure
• Transient Failure
– Power glitch
– System patch and reboot
– ECC trap
• Partial “Permanent” Failure
– Disk failure
– Partial network failure
• Wholesale “Permanent” Failure
– Total hardware failure
– Natural disaster
Availability Problems
Large numbers of Processes, Threads,
Software Barriers, Busy Waiting
Temporarily Unresponsive
and/or
Unavailable
Possible Solutions
• Transient Failure
– Restart/replay/resume on the same node
– Task-migration is possible
• Permanent Partial Failure
– Rebalance the workload on surviving nodes
– Partial task-migration is needed
• Permanent Wholesale Failure
– Reconfigure the applications and services
– Massive task-migration to new platform
Checkpointing
• Common feature in high-performance
computing (HPC) platforms
• Saves the execution state
• Application or system-level
• Mechanism for task migration
Application vs System Level
• Application-level Recovery Point
– Developed specifically for each application
– Generally smaller footprint
– Data accessibility restrictions
• Kernel-level Recovery Point
– Snapshot processes
– Full resource restoration
– Flexibility due to system level preemption
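To make the application-level option concrete, here is a minimal sketch of a program that saves only the state it needs (a loop index and a running total) and resumes from it after a simulated crash. The state layout, function names, and the `stop_after` crash switch are all hypothetical. This is not how a kernel-level tool such as BLCR works; it only illustrates the smaller footprint and application-specific nature listed above.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state atomically: temp file first, then rename over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or a fresh one if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"next_i": 0, "total": 0}


def run(path, n, checkpoint_every=100, stop_after=None):
    """Sum 0..n-1, checkpointing periodically; stop_after simulates a crash."""
    state = load_checkpoint(path)
    steps = 0
    while state["next_i"] < n:
        if stop_after is not None and steps >= stop_after:
            return None  # simulated crash: work since the last checkpoint is lost
        state["total"] += state["next_i"]
        state["next_i"] += 1
        steps += 1
        if state["next_i"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    return state["total"]
```

Calling `run(path, 1000, stop_after=250)` and then `run(path, 1000)` resumes from the last saved recovery point instead of starting over.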
Berkeley Lab Checkpoint/Restart (BLCR)
• System-level
• Kernel module
• Checkpoint creation implemented
• Process recovery implemented
• Linked to BLCR libraries at execution
• Stores checkpoint data locally (stack, heap, registers, signals, etc.)
Contribution
• Enhanced BLCR performance through a latency-tolerant technique
• Increased BLCR availability through a novel checkpoint creation technique
I/O Optimization
• Avoided extensive modification to BLCR
• Reduced the disk latency of checkpoint creation
• Implemented a caching technique
• Improved I/O performance 4-fold or more
• System overhead under 300 KB in experimental tests
Checkpoint Caching
• Buffer used as temporary storage
• Storage block flushed in large volume
• Trade-off between resource consumption and improved I/O efficiency
cr_copy(chkptData, count)
    if(chkptBuf is NULL)
        kmalloc count bytes for chkptBuf;
        copy chkptData into chkptBuf;
    else
        kmalloc chkptBuf size + count bytes for tempBuf;
        copy chkptBuf into tempBuf;
        copy chkptData into tempBuf after the chkptBuf contents;
        krealloc chkptBuf to its expanded size;
        memmove tempBuf into chkptBuf;
        kfree tempBuf;
    end if
Optimized Write Operation
Remote Checkpoint
• BLCR is limited to local disk storage
• Remote checkpointing offers an off-site storage option
• Uses sockets to transmit data
• Requires a predefined destination
• Outperforms BLCR in some experimental tests
Remote Checkpoint Server
• Single-threaded daemon
• Compiled with GCC
• Stores the recovery point external to the client node
• Could be ported to a Microsoft platform
while(true)
    create socket;
    bind to address;
    listen for incoming connections;
    wait for client to connect;
    create file descriptor;
    while(buffered data received)
        write checkpoint data;
    close file descriptor;
    close socket;
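The server loop above can be sketched in user space as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the function names, the 64 KB receive size, and the `max_clients` stop condition are hypothetical, and the sketch creates the listening socket once instead of once per iteration.

```python
import socket
import threading


def make_listener(host="127.0.0.1", port=0):
    """Create, bind, and listen; port 0 asks the OS for a free port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind((host, port))
    s.listen(1)
    return s


def serve_checkpoints(listener, out_path, max_clients=None):
    """Accept clients one at a time and store what each sends in out_path.

    listener: a bound, listening TCP socket.
    max_clients: stop after this many connections (None = run forever).
    Each new client overwrites the previous contents of out_path.
    """
    served = 0
    while max_clients is None or served < max_clients:
        conn, _addr = listener.accept()        # wait for client to connect
        with conn, open(out_path, "wb") as f:  # create file descriptor
            while True:
                chunk = conn.recv(65536)
                if not chunk:                  # client closed the connection
                    break
                f.write(chunk)                 # write checkpoint data
        served += 1
```

Because the loop handles one connection at a time, a second client waits until the first has finished sending, which matches the single-threaded design on the slide.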
Modified Write Operation
• Data is sent over TCP
• TCP may hold small writes until a full segment (bounded by the MTU) is ready for delivery
• Only modification is to the write operation of BLCR
if(remote chkpt)
    if(socket is NULL)
        create socket;
        establish connection;
        if handshake fails, break and perform the original_chkpt;
    end if
    package checkpoint data;
    send data message;
end if
if(original_chkpt)
    original BLCR write operation;
end if
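The fallback logic above can be sketched in user space like this. The class name, parameters, and return values are hypothetical, and the sketch sends raw bytes without any message packaging. It tries the remote destination first, reuses the socket across writes, and switches permanently to the local write if the connection or a send fails.

```python
import socket


class CheckpointWriter:
    """Send checkpoint data to a remote server, or write locally on failure."""

    def __init__(self, local_path, remote_addr=None, timeout=2.0):
        self.local_path = local_path
        self.remote_addr = remote_addr  # (host, port) tuple, or None for local only
        self.timeout = timeout
        self.sock = None

    def write(self, data):
        """Return "remote" or "local" to show which path handled the data."""
        if self.remote_addr is not None:
            try:
                if self.sock is None:   # first write: establish the connection
                    self.sock = socket.create_connection(
                        self.remote_addr, timeout=self.timeout)
                self.sock.sendall(data)
                return "remote"
            except OSError:
                self.close()
                self.remote_addr = None  # stay on the local path from now on
        with open(self.local_path, "ab") as f:
            f.write(data)
        return "local"

    def close(self):
        if self.sock is not None:
            self.sock.close()
            self.sock = None
```

One limitation of this sketch: if a send fails partway through, the server may hold a partial checkpoint while the local file only receives the data from the failed write onward. A real implementation would need to handle that case.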
Design
I/O Optimization Write
write(chkptData, count)
    if(chkptBuf has space for the incoming chkptData)
        cr_copy(chkptData, count);
    else
        vfs_write(chkptBuf);
        vfs_write(chkptData);
        kfree(chkptBuf);
        chkptBuf = NULL;
    end if
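A user-space sketch of this cached write path follows. It is an illustration under stated assumptions, not the kernel code: the class name, the default capacity, and the `sink_writes` counter are hypothetical, and a `bytearray` stands in for the kmalloc-managed buffer.

```python
class CheckpointCache:
    """Collect small checkpoint writes in memory and flush them in bulk."""

    def __init__(self, sink, capacity=256 * 1024):
        self.sink = sink            # file-like object with write(bytes)
        self.capacity = capacity    # hypothetical buffer size in bytes
        self.buf = bytearray()
        self.sink_writes = 0        # how many times the sink was actually written

    def write(self, data):
        if len(self.buf) + len(data) <= self.capacity:
            self.buf += data        # room left: just cache it
        else:
            self.flush()            # write out what is cached first...
            self._emit(data)        # ...then the incoming data, preserving order

    def flush(self):
        """Write any cached data to the sink and empty the buffer."""
        if self.buf:
            self._emit(bytes(self.buf))
            self.buf.clear()

    def _emit(self, payload):
        self.sink.write(payload)
        self.sink_writes += 1
```

The caller must invoke `flush()` once at the end of the checkpoint so the last cached bytes reach storage. A larger capacity means fewer writes to the sink at the cost of more memory, which is the trade-off named on the Checkpoint Caching slide.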
Remote Checkpoint Write
Experimental Setup
I/O Optimization
• Dell Workstation, 3.06 GHz Intel Pentium 4, 1 GB Memory, 5,400 RPM Hard Disk, Linux 2.6
• BLCR Implementation
• Optimized BLCR (O-BLCR) Implementation
Remote Checkpoint
• Dell PowerEdge 700, 2.80 GHz Dual-processor Intel Pentium 4, 3 GB Memory, 5,400 RPM Hard Disk, Linux 2.6
• Dell Workstation, 3.06 GHz Intel Pentium 4, 1 GB Memory, 5,400 RPM Hard Disk, Linux 2.6
• BLCR Implementation
• BLCR with NFS (BLCR+NFS)
• BLCR with our Remote Checkpoint Technique (BLCR+R)
Benchmarks
Resource Utilization by Benchmark
Benchmark  Program                 CPU     Memory  I/O
TSP        NP-Complete             High    Low     Low
AES        Data Encryption         High    Low     Medium
GE         Linear Equation Solver  Low     High    High
HC         File Compression        Medium  Medium  Medium
I/O Optimization Results
Remote Checkpoint Results
Conclusion
• Minimal modification to BLCR
• The I/O optimization technique reduced the write latency of BLCR
• Remote checkpointing adds a new feature that increases BLCR availability
• These techniques should be integrated into the core BLCR source code
Future Work
• Server authentication protocol
• Data packet encryption
• Automated process load balancing
Questions