PinPlay: A Framework for Deterministic Replay
and Reproducible Analysis of
Parallel Programs
Harish Patil, Cristiano Pereira, Mack Stallcup,
Gregory Lueck, James Cownie
Intel Corporation
CGO 2010, Toronto, Canada
Software & Services Group
Non-Determinism
• Program execution is not repeatable across runs
– Interactions with environment (single-threaded)
– Shared-memory interleaving (multi-threaded)
• Source of many problems
– Hard to predict and test behaviors -> leads to bugs
– Very hard and unpleasant to debug
– Breaks program analyses that rely on repeatability
• Obstacle for adoption of parallel programming
Dealing with Non-Determinism
• Eliminate it
– Deterministic program execution enforced by runtime
(e.g. constrained execution [ISCA’09])
• Deterministic Replay
– Allow non-determinism, but capture the execution and reproduce it when needed
– Every instruction gets same input as in original run
• This paper: User-level Deterministic Replay
– Implementation, challenges and usage examples
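To make "every instruction gets same input as in original run" concrete, here is a minimal sketch in Python. It is not PinPlay's implementation; the names `Recorder`, `Replayer`, and `run_program` are hypothetical, and a random number stands in for any non-deterministic environment input. During capture the real source is called and its results are logged; during replay the logged values are served back in the same order.

```python
import random


class Recorder:
    """Capture mode: call the real non-deterministic source and log each result."""

    def __init__(self, source):
        self.source = source
        self.log = []

    def next_input(self):
        value = self.source()
        self.log.append(value)
        return value


class Replayer:
    """Replay mode: never touch the real source; return logged values in order."""

    def __init__(self, log):
        self._values = iter(log)

    def next_input(self):
        return next(self._values)


def run_program(env, steps=5):
    """A toy 'program' whose result depends on its environment inputs."""
    total = 0
    for _ in range(steps):
        total = total * 31 + env.next_input()
    return total


# Hypothetical environment input: a random integer per request.
recorder = Recorder(lambda: random.randint(0, 1_000_000))
original = run_program(recorder)                 # capture run
replayed = run_program(Replayer(recorder.log))   # replay run, same result
```

The toy program cannot tell the two modes apart, which is the property a replayer needs: the program under replay sees exactly the inputs it saw during capture.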
Requirements
• No OS or hardware changes
• No changes in user environment
• Manageable log sizes for long runs
• Reasonable run-time overhead
• Multi-threaded and multi-processed applications
• Integration with other existing analysis tools (e.g. dynamic analyzers, debuggers, profilers)
• No assumptions about synchronization APIs
Rest of the Talk
• Motivation & Requirements
• PinPlay Overview
• Usage Examples
• Results
• Summary
PinPlay
User-level deterministic replay and analysis
[Figure: capture runs Binary + Input under PinPlay on the OS (Linux® or Windows®), producing the normal program output plus logs (pinballs); replay feeds the logs (pinballs) to PinPlay + analysis tools or debuggers on the OS (Linux® or Windows®)]
• Runs in application’s native environment
• Replays user code
• OS independent: cross-OS replay!
• Easily integrates w/ other tools and debuggers
Replay Models
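• Parallel-capture and parallel-replay
[Figure: threads T0, T1, T2 are captured together under PinPlay into logs (pinballs), then replayed together under PinPlay]
• Parallel-capture and isolated-replay
[Figure: threads T0, T1, T2 are captured together under PinPlay into per-thread logs (pinballs); T0, T1 and T2 are then each replayed separately under PinPlay]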
Information Captured For Replay
[Figure: "All memory values", divided into reads without prior writes, OS side-effects used by app, values from remote threads, and all other values (not captured)]
1. Subset of memory values
   • Shadow-memory to capture first reads without prior writes and OS side-effects automatically [Sigmetrics’06]
   • Values changed by remote threads
2. Initial registers and OS register side-effects
   • Signals/Exceptions/APCs/system calls
3. Code executed (user and libraries)
4. Position of code and stack
5. Output of some instructions (e.g. RDTSC)
6. Subset of shared-memory access interleaving (transitive opt. - FDR [ISCA’03])
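Logging only a subset of the shared-memory interleaving can be pictured with vector clocks: a cross-thread dependence is skipped when it is already implied by earlier logged dependences plus program order. The sketch below illustrates that transitive optimization in Python under simplifying assumptions; it is not PinPlay's or FDR's actual data structure. The class `DependenceLogger` and its methods are hypothetical, and only read-after-write dependences on the last writer are tracked.

```python
class DependenceLogger:
    """Logs write->read dependences between threads, skipping implied ones."""

    def __init__(self, num_threads):
        self.icount = [0] * num_threads
        # clock[t][u]: highest instruction count of thread u that is already
        # ordered before thread t's current point.
        self.clock = [[0] * num_threads for _ in range(num_threads)]
        self.last_writer = {}  # address -> (thread, icount, clock snapshot)
        self.logged = []       # (src_thread, src_icount, dst_thread, dst_icount)

    def _step(self, t):
        self.icount[t] += 1
        self.clock[t][t] = self.icount[t]

    def write(self, t, addr):
        self._step(t)
        self.last_writer[addr] = (t, self.icount[t], list(self.clock[t]))

    def read(self, t, addr):
        self._step(t)
        if addr not in self.last_writer:
            return
        src, src_icount, src_clock = self.last_writer[addr]
        if src == t:
            return  # same thread: program order already covers it
        if self.clock[t][src] >= src_icount:
            return  # implied by an earlier logged dependence: skip
        self.logged.append((src, src_icount, t, self.icount[t]))
        # Everything ordered before the writer is now ordered before the reader.
        self.clock[t] = [max(a, b) for a, b in zip(self.clock[t], src_clock)]


deps = DependenceLogger(num_threads=2)
deps.write(0, "x")  # T0 instruction 1
deps.write(0, "y")  # T0 instruction 2
deps.read(1, "y")   # T1 instruction 1: logged as (0, 2, 1, 1)
deps.read(1, "x")   # T1 instruction 2: implied by the edge above, not logged
```

Here two cross-thread dependences occur but only one is logged: once T1 is known to run after T0's second instruction, its later read of `x` is ordered automatically.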
PinPlay Architecture
[Figure: in user land, the application code and data, your Pin-based tool, and the PinPlay library run on top of Intel’s Pin (JIT compiler and instrumentor) *, above the OS (Linux® or Windows®); the PinPlay library works with the pinball and contains two parts]
• Logger: instrumentation and analysis to capture logs
• Replayer: instrumentation and analysis to inject side-effects
Capable of logging, replaying and relogging execution (recapture from a replaying run)
* http://www.pintool.org/
Cross-OS Replay and Challenges
• Log on one OS and replay on another
• System call translations
– Most OS activity does not happen on replay (only side-effects restored)
– Semantics are translated across OSes (e.g. create thread)
• Memory mapping
– Problem: address space different across OSes
– Solution: use Pin’s Fetch API to redirect code and
memory operand rewriting to redirect data
address space
on Windows®
code
code
data
data
address space
on Linux®
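The memory-mapping problem comes down to translating an address recorded on the logging OS into wherever the same region lives during replay. The Python sketch below shows only that lookup, under stated assumptions: it does not use or model Pin's API, the names `Region` and `AddressTranslator` are hypothetical, the addresses are made up, and regions are assumed not to overlap.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    logged_base: int  # start address of the region in the logging run
    size: int         # region length in bytes
    replay_base: int  # where the same region is placed during replay


class AddressTranslator:
    """Maps addresses from the logged address space to the replay address space."""

    def __init__(self, regions):
        self.regions = sorted(regions, key=lambda r: r.logged_base)
        self.bases = [r.logged_base for r in self.regions]

    def translate(self, logged_addr):
        # Find the last region whose base is <= logged_addr.
        i = bisect_right(self.bases, logged_addr) - 1
        if i >= 0:
            region = self.regions[i]
            offset = logged_addr - region.logged_base
            if offset < region.size:
                return region.replay_base + offset
        raise KeyError(f"address {logged_addr:#x} is not in any logged region")


# Made-up layout: one code region and one data region.
translator = AddressTranslator([
    Region(logged_base=0x400000, size=0x1000, replay_base=0x7F0000400000),
    Region(logged_base=0x600000, size=0x2000, replay_base=0x7F0000600000),
])
code_addr = translator.translate(0x400010)  # redirected instruction fetch
data_addr = translator.translate(0x601FFF)  # redirected memory operand
```

The same table serves both cases on the slide: code fetches and data operands are each looked up by their logged address and redirected to the replay-time location.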
Usage Example: Program Analysis
• Sampling and checkpointing for simulation
– One run for profiling and finding representative regions, another for checkpointing
– Requirement: both runs must be identical
[Figure: a multi-process MPI program runs under PinPlay, producing per-process logs (pinballs); each per-process pinball is replayed under PinPlay + Profiler to find representative regions, and under PinPlay + Checkpointer to produce checkpoints for simulation]
• Pinballs are used to share workloads for Pin-based analyses among architects
Usage Example: Replay for Debugging
• Capture a buggy run and replay under debugger
– Guaranteed to reproduce the bug and helps root causing
– Works w/ off-the-shelf unmodified debuggers (e.g. GDB)
– PinPlay-based tool extends GDB commands w/ your own
– Limitation: debugger can’t change control-flow
• Used to debug various multi-threaded applications
• Also using it for in-house debugging of concurrency issues with a major database vendor
[Figure: logs (pinballs) and the binary feed a PinPlay-enabled debugger tool running on Intel’s Pin, which talks to GDB (unmodified) over the remote protocol]
Results
[Figure: bar chart of logger slowdown and replayer slowdown relative to native, y-axis from 0 to 160, one group per benchmark/application; chart annotation: isolated replay. Size (MB) labels on the chart: 39, 91, 396, 2140, 1116, 5222, 1996]

Benchmark/Application                     Average Icount (Billions)
SPEC2006 (single-threaded)                924
SPECOMP2001 (4-threaded OpenMP)           307
McBench (4-threaded RMS)                  156
MILC-8p (numerical simulator/MPI)         109
POP-8p (ocean circulation model/MPI)      952
WRF-8p (Weather Prediction/MPI)           755
EnergyApp-8p (Energy Exploration/MPI)     693
Sources of Slowdown
• Instrumentation of every memory operation to
identify system call side-effects and log data
– Could be done by OS at the cost of OS modification
or OS-specific analysis (doesn’t work on Windows®)
• Locks for shadow-memory accesses
– Could be eliminated by using a shadow-copy per
thread at the cost of significant increase in log sizes
• Other optimizations possible (please look at the
paper)
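The per-access instrumentation described above can be pictured as a comparison against shadow memory: the logger remembers every value the program has itself read or written, and a read whose real value is missing from or different from the shadow must have come from outside (initial memory, a system call side-effect, or another thread) and is logged. The Python sketch below illustrates that idea only; it is not PinPlay's implementation, the name `ShadowLogger` is hypothetical, and a dictionary stands in for real memory.

```python
_MISSING = object()  # marker for "the program has never touched this address"


class ShadowLogger:
    """Logs values the program reads but did not produce itself."""

    def __init__(self, memory):
        self.memory = memory  # real memory: address -> value
        self.shadow = {}      # values the program has already read or written
        self.log = []         # (access_index, address, value) to inject on replay
        self.accesses = 0

    def on_write(self, addr, value):
        self.accesses += 1
        self.memory[addr] = value
        self.shadow[addr] = value

    def on_read(self, addr):
        self.accesses += 1
        value = self.memory[addr]
        if self.shadow.get(addr, _MISSING) != value:
            # First read without a prior write, or changed behind our back.
            self.log.append((self.accesses, addr, value))
            self.shadow[addr] = value
        return value


memory = {0x10: 7}
logger = ShadowLogger(memory)
logger.on_read(0x10)      # access 1: first read, logged
logger.on_read(0x10)      # access 2: already known, not logged
logger.on_write(0x20, 5)  # access 3: program's own write
logger.on_read(0x20)      # access 4: matches shadow, not logged
memory[0x20] = 9          # simulated system call writing into a buffer
logger.on_read(0x20)      # access 5: differs from shadow, logged
```

Every read and write pays for a shadow lookup, which is the first source of slowdown on the slide. With several threads sharing one shadow, each lookup would also need a lock, which is the second.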
Summary
• User-level deterministic capture and replay
– No OS changes, special hardware, or virtualization
– Integrates w/ other Pin-tools for repeatable analysis
and debugging
• Replay occurs on any machine and works
across OSes (Windows to Linux)
• Pinballs are OS-independent and self-contained
– Ideal for sharing workloads among researchers, for
Pin-based analyses
• We will release PinPlay libraries in the future
Q&A