Speculation Supriya Vadlamani CS 6410 Advanced Systems

advertisement
Speculation
Supriya Vadlamani
CS 6410 Advanced Systems
Introduction : Authors
• Edmund B Nightingale
– Microsoft Research(currently)
– Speculator -thesis
• Jason Flinn
– University of Michigan
• Peter M Chen
– University of Michigan
Edmund B. Nightingale, Peter M. Chen and Jason Flinn
SPECULATIVE EXECUTION IN
DISTRIBUTED FILE SYSTEM
Distributed File System : basics
• allows access to files located on another remote host
as though working on the actual host computer
• Does it perform worse than local file system?
– YES , Synchronous I/O
– Cache Coherence :
Sync n/w messages provide consistency
– Data Safety
Sync disk writes provide safety
Proposal
• Solutions :
1. Sacrifice guarantees to gain speed
2. Speculation with OS support
• Speculation
– DFS can be safe, consistent and fast
• Light weight checkpoints
• Speculative execution
• Tracking causal dependencies
Conditions for success
• Time for lightweight Check point < n/w round trip time
– Around 52 microseconds
• Modern Computers can afford to execute speculatively
– Spare resources: CPU cycles and memory
• File system clients can predict the outcome correctly.
– Conflicts are rare
– Very few concurrent updates
Speculation in NFS
NFS without speculation
NFS with speculation
Speculation Interface
1) System call
2) Create speculation
3) Fail speculation
fail_Speculation()
Time
create_Speculation()
3) Commit speculation
commit_Speculation()
Spec Id
Spec
Spec Id
Undo log
Advantages of the Interface
• Application independent
• Speculation success/failure cab be declared by
any process
• Abstraction over the hypothesis underlying
each speculation
Implementing Speculation
1) System call
2) Create speculation
Time
Checkpoint
Undo log
Spec
Speculation Success
1) System call
2) Create speculation
3) Commit speculation
Time
Checkpoint
Undo log
Spec
11
Speculation Failure
1) System call
2) Create speculation
3) Fail speculation
Time
Checkpoint
Undo log
Spec
Implementing Speculation
• Data Structures
– Speculation structure
• Set of kernel objects
– Depend on speculation
– Undo log
• Ordered list of speculative objects
Implementing Speculation
• Speculative process can
– Call System calls that don’t modify state (Getpid())
• Can modify calling process’ state : dup2
– Can perform file system operations
• If flag is set
• Correct Speculation Execution
– Invisible to user or external device
– Process shouldn’t view speculative state unless it
is speculatively dependent
Implementing Speculation
Ensuring Correctness
• Issues:
– External state
• Displaying a message on to the console
• Sending a message over the network
• Solutions:
– Propagate dependencies
– Buffer
– Block the process
Multi-process Speculation
[Speculative processes and IPC]
• Processes often cooperate
Example: “make” forks children to compile, link, etc.
Would block if speculation limited to one task
• Allow kernel objects to have speculative state
– Examples: inodes, signals, pipes, Unix sockets, etc.
– Propagate dependencies among objects
– Objects rolled back to prior states when specs fail
Multi-process Speculation
[Speculative processes and IPC]
Checkpoint
Spec 1
Spec 2
Checkpoint
Checkpoint
pid 8000
pid 8001
Chown-1
Write-1
inode 3456
Multi-process Speculation
[Speculative processes and IPC]
Speculator supports:
• DFS objects
• RAMFS
• Local file system objects - Ext3
• Pipes & FIFOs
• Unix Sockets
• Signals
• Fork & Exit
Handling Mutating Operations
Client 1
Client 2
1. cat foo > bar
2. cat bar
• bar depends on cat foo
• What does client 2 view in bar?
Simple Solution:
restricted nature of communication in a server-based DFS
Handling Mutating Operations
• Server always knows the true state of the file
system;
– Client includes the hypothesis underlying that
speculation.
– Server: Evaluates the hypothesis underlying the
speculation
• If hypothesis is valid.
– Mutation is performed
Else
– fails the mutation
Handling Mutating Operations
• Eg: BlueFS client : check version RPC [version number of
its cached copy foo]
Server: checks this version number against current
version
fails the speculation if the two differ
Server: If previously failed any of the listed
speculations, it fails the mutation.
• Causal dependencies:
– set of speculations associated with the undo log of prior
processes
– List returned by create speculation and included in any
speculative RPC sent to the server.
Speculative group commit
• Parallelize writes to disk by grouping them
Client
write
commit
write
commit
Server
Client
Server
SPEC NFS
• preserves existing NFS semantics, including
close-to-open consistency.
• Issues the same RPCs, many of these RPCs are
speculative
NFS: Security is still an issue!
Blue FS
Features:
– Single copy semantics
– Synchronous I/O
• Each file, directory, etc. has version number
– Incremented on each mutating op (e.g. on write)
– Checked prior to all operations.
– Many ops speculate and check version async
Evaluation
• Two Dell Precision 370 desktops as the client and file
server
• Each machine has a 3 GHz Pentium 4 processor, 2GB
DRAM, and a 160GB disk.
• To insert delays, we route packets through a Dell
Optiplex GX270 desktop running the NISTnet [4]
network emulator.
• Ping time between client and server is 229 s.
Results : Apache Benchmark
300
4500
NFS
SpecNFS
BlueFS
ext3
Time (seconds)
250
4000
3500
3000
200
2500
150
2000
1500
100
1000
50
500
0
0
No delay
30 ms delay
Time (seconds)
Cost of Rollback
140
2000
120
1800
1600
100
1400
80
1200
60
1000
800
40
600
20
400
200
0
0
NFS
SpecNFS
No delay
ext3
No files invalid
10% files invalid
50% files invalid
100% files invalid
NFS
SpecNFS
30ms delay
ext3
Time (seconds)
Group Commit & Sharing State
4500
500
450
400
350
300
250
200
150
100
50
0
4000
Default
3500
No prop
3000
No grp commit
2500
No grp commit & no prop
2000
1500
1000
500
NFS
SpecNFS
0 ms delay
BlueFS
0
NFS
SpecNFS
30ms delay
BlueFS
Discussion
• Speculation: not a new concept
– Used for hardware
– Is speculation in OS a good idea?
• Server handles the speculation
– Server crashes ?
Edmund B. Nightingale, Kaushik Veeraraghavan ,Peter M. Chen and Jason
Flinn
RETHINK THE SYNC
Synchronous I/O v/s Asynchronous I/O
• Very slow
• Applications a
frequently blocked
(decreases performance
by 2 orders of
magnitude)
• Fast, not safe
• Does not block an
application
• Complicates applns tht
require durability and
reliability
Despite, the Asynchronous I/O ‘s poor guarantees, users
prefer asynchronous I/O because synchronous I/O is too slow!
External Synchrony
• External Synchrony
– Provides the reliability & simplicity of synchronous
system
– Closely approaches the performance of
asynchronous system
• Synchronous I/O: Application centric view
• External Synchrony: User centric view
Example: Synchronous I/O
101
write(buf_1);
102
write(buf_2);
103
print(“work done”);
104
foo();
Application blocks
Application blocks
%
%work
done
%
Process TEXT
OS Kernel
Disk
Observing synchronous I/O
101
write(buf_1);
102
write(buf_2);
Depends on 1st write
103
print(“work done”);
Depends on 1st & 2nd write
104
foo();
Sync I/O externalizes output based on causal ordering
Enforces causal ordering by blocking an application
Ext sync: Same causal ordering without blocking applications
Example: External synchrony
101
write(buf_1);
102
write(buf_2);
103
print(“work done”);
104
foo();
%
%work
done
%
Process TEXT
OS Kernel
Disk
Optimizations to External Synchrony
• Two modifications are grouped and
committed as a single file system transaction
• Buffer screen output
Causal dependencies need to be resolved to
between file system modifications and external
output
Limitations of External Synchrony
• Externally synchronous system can propagate
failures using a speculator to checkpoint a
process before modifications
• User may have temporal expectations about
modifications committed to a disk
• Modifications to data in two different file system
cannot be committed in a single transaction
Evaluation
Implemented ext sync file system Xsyncfs
Based on the ext3 file system
Use journaling to preserve order of writes
Use write barriers to flush volatile cache
Compare Xsyncfs to 3 other file systems
Default asynchronous ext3
Default synchronous ext3
Synchronous ext3 with write barriers
Postmark Benchmark
10000
Time (Seconds)
1000
100
10
1
ext3-async
xsyncfs
ext3-sync
ext3-barrier
New Order Transactions Per Minute
MYSQL Benchmark
5000
4500
4000
3500
3000
2500
2000
1500
1000
500
0
xsyncfs
ext3-barrier
0
5
10
Number of db clients
15
20
SPEC web99 Throughput
400
Throughput (Kb/s)
350
300
250
200
150
100
50
0
ext3-async
xsyncfs
ext3-sync
ext3-barrier
Discussion
• Xsyncfs:
– Performs several orders better than existing solns.
– Why isn’t it widely used?
• Checkpoints:
– How far can you go?
Conclusion
• Speculation in DFS
– Spec NFS
• Security is still an issue
• Same authors examined security issues wrt speculation in later
works
– Blue FS
• Consistent and single copy semantics
• Rethink the sync : Xsyncfs
• a new model for local file I/O that provides the reliability and
simplicity of synchronous I/O
• approximates the performance of asynchronous I/O
• But limitations of external synchrony exist
References
• Edmund B. Nightingale, Peter M. Chen, Jason Flinn, "Speculative Execution
in a Distributed File System", Proceedings of the 2005 Symposium on
Operating Systems Principles (SOSP), October 2005
• Edmund B. Nightingale, Kaushik Veeraraghavan, Peter M. Chen, Jason
Flinn, "Rethink the sync", Proceedings of the 2006 Symposium on
Operating Systems Design and Implementation (OSDI), November 2006
• Edmund B. Nightingale, Daniel Peek, Peter M. Chen, Jason Flinn,
"Parallelizing security checks on commodity hardware", Proceedings of the
2008 International Conference on Architectural Support for Programming
Languages and Operating Systems (ASPLOS), March 2008
• Wikipedia
Download