Pablo Project

Pablo Project
• http://www-pablo.cs.uiuc.edu/Projects/Pablo/
• Goal: portable performance data
environment for parallel systems
• Pablo Version 5.0 components
– SDDF Library
– TraceLibrary
– I/O Analysis programs
– Analysis GUI
– SvPablo
Self-Defining Data Format (SDDF)
• Performance data description language that
specifies both data record structures and
data record instances
• Supports definition of records containing
scalars and arrays of the base types found in
most programming languages
• Developed to link Pablo instrumentation
software to Pablo analysis environment
SDDF (cont.)
• Goals - compactness, portability, generality,
extensibility
• ASCII and binary formats (binary contains flag
indicating byte ordering)
• SDDF interface library -- library of C++ classes
for writing and interpreting files in SDDF format
• FileStats utility -- shows types of records and
range of values appearing in SDDF file
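The byte-ordering flag in the binary format lets a reader decide whether values must be byte-swapped before use. A minimal sketch of that idea in C follows; the flag values and function names are hypothetical and are not taken from the SDDF library or its file layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag values; the real SDDF binary header may differ. */
enum { ORDER_LITTLE = 0, ORDER_BIG = 1 };

/* Report the byte order of the machine running this code. */
int host_byte_order(void)
{
    const uint32_t probe = 1;
    /* On a little-endian host the least significant byte is stored first. */
    return (*(const unsigned char *)&probe == 1) ? ORDER_LITTLE : ORDER_BIG;
}

/* Reverse the four bytes of a 32-bit value. */
uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Convert a value read from a file whose flag says file_order. */
uint32_t to_host32(uint32_t raw, int file_order)
{
    assert(file_order == ORDER_LITTLE || file_order == ORDER_BIG);
    return (file_order == host_byte_order()) ? raw : swap32(raw);
}
```

A reader would call to_host32() on every 32-bit field, passing the flag read from the file, so a file written on one architecture can be interpreted on another.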
SDDF Example
// "description" "IO Seek"
"Seek" {
    // "Time" "Timestamp"
    int "Timestamp"[];
    // "Seconds" "Floating Point Timestamp"
    double "Seconds";
    // "Event ID" "Corresponding event"
    // "700013" "lseek"
    // "700015" "fseek"
    int "Event Identifier";
    // "Node" "Processor number"
    int "Processor Number";
    // "Duration" "Event duration in seconds"
    double "Duration";
    // "File ID" "Unique file identifier"
    int "File ID";
    // "Number Bytes" "Number of bytes traversed"
    int "Number Bytes";
    // "Offset" "Byte offset from position indicated by Whence"
    int "Offset";
    // "Whence" "Indicates file position that Offset is measured from"
    // "0" "SEEK_SET"
    // "1" "SEEK_CUR"
    // "2" "SEEK_END"
    int "Whence";
};;
SDDF Example (cont.)
"Seek" {
    [2] { 201803857, 0 },
    20.1803857, 70013, 0, 0.0031946, 3, 0, 0, 0
};;
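To make the mapping between fields and values concrete, the sketch below formats a record instance like the one above from a C struct. The struct, its member names, and format_seek() are hypothetical illustrations that mirror the nine values in the instance; the real SDDF interface is a C++ class library and is not shown here.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical in-memory mirror of the "Seek" record fields. */
struct seek_record {
    int    timestamp[2];
    double seconds;
    int    event_id;
    int    processor;
    double duration;
    int    file_id;
    int    num_bytes;
    int    offset;
    int    whence;
};

/* Format one record instance as ASCII text; returns the snprintf result. */
int format_seek(char *buf, size_t size, const struct seek_record *r)
{
    assert(buf != NULL && r != NULL);
    return snprintf(buf, size,
        "\"Seek\" { [2] { %d, %d }, %.7f, %d, %d, %.7f, %d, %d, %d, %d };;",
        r->timestamp[0], r->timestamp[1], r->seconds, r->event_id,
        r->processor, r->duration, r->file_id, r->num_bytes,
        r->offset, r->whence);
}
```

Filling the struct with the values from the example and calling format_seek() yields the same record on a single line.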
Pablo TraceLibrary
• Basic trace library with extensions for
procedure tracing, loop tracing, NX
message passing tracing, I/O tracing, MPI
tracing
• Basic trace library
– functions traceEvent, countEvent, startTimeEvent,
endTimeEvent
– event ID specifies type of event that is being traced
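The slide names the four basic calls but not their signatures. As a rough illustration of what calls of this kind usually do (store a full record, bump a counter, time an interval), here is a self-contained sketch. Every name prefixed with sketch_ and the struct layout are assumptions for illustration, not the TraceLibrary API; timestamps are passed in by the caller to keep the example deterministic.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EVENTS 64   /* capacity of the in-memory trace buffer */
#define MAX_IDS    16   /* distinct event IDs this sketch can track */

struct trace_entry { int event_id; double time; };

struct tracer {
    struct trace_entry buf[MAX_EVENTS]; /* full event records */
    size_t used;                        /* entries filled in buf */
    long   counts[MAX_IDS];             /* per-ID occurrence counts */
    double started[MAX_IDS];            /* start time of an open interval */
    double total[MAX_IDS];              /* accumulated interval time */
};

/* Store a full record; returns 0 on success, -1 if the buffer is full. */
int sketch_trace_event(struct tracer *t, int id, double now)
{
    assert(id >= 0 && id < MAX_IDS);
    if (t->used == MAX_EVENTS)
        return -1;   /* a real library would flush the buffer here */
    t->buf[t->used].event_id = id;
    t->buf[t->used].time = now;
    t->used++;
    return 0;
}

/* Record only that the event happened. */
void sketch_count_event(struct tracer *t, int id)
{
    assert(id >= 0 && id < MAX_IDS);
    t->counts[id]++;
}

/* Mark the start of a timed interval for this event ID. */
void sketch_start_time_event(struct tracer *t, int id, double now)
{
    assert(id >= 0 && id < MAX_IDS);
    t->started[id] = now;
}

/* Close the interval and add its length to the running total. */
void sketch_end_time_event(struct tracer *t, int id, double now)
{
    assert(id >= 0 && id < MAX_IDS);
    t->total[id] += now - t->started[id];
}
```

In each call the event ID selects which kind of event the data belongs to, which is the role the slide assigns to it.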
Pablo TraceLibrary (cont.)
• Extensions provide wrapper functions for
management of event IDs for various event types
• Procedure and loop tracing done manually by
inserting calls to TraceLibrary routines into
application source code
• Default mode is to dump trace buffer contents to a
trace file, but it’s possible to have trace data output
sent to a socket for real-time analysis
TraceLibrary Scalability
• Documentation states that TraceLibrary monitors and
dynamically alters volume, frequency, and types of event
data by
– associating a user-specified maximum trace level with
each event and
– substituting less invasive data recording (e.g., event
counts rather than complete event traces) if maximum
user-specified rate is exceeded
• Unclear if these measures are taken automatically by the high-level trace library or if they must be explicitly called by
the user at a low level
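The substitution described above (event counts in place of full traces once a rate limit is exceeded) can be sketched as a per-event throttle. This is only one plausible reading of the documentation; the fixed-length window, the struct, and throttle_event() are assumptions, not TraceLibrary code.

```c
#include <assert.h>
#include <stddef.h>

enum record_mode { RECORD_FULL, RECORD_COUNT_ONLY };

struct event_throttle {
    double window_start;    /* start time of the current window, seconds */
    double window_len;      /* window length, seconds */
    long   max_per_window;  /* user-specified limit on full records */
    long   seen;            /* events seen in the current window */
    long   counted;         /* events reduced to a count so far */
};

/* Decide how to record an event that occurs at time now. */
enum record_mode throttle_event(struct event_throttle *t, double now)
{
    assert(t != NULL && t->window_len > 0.0);
    if (now - t->window_start >= t->window_len) {
        t->window_start = now;   /* begin a new window */
        t->seen = 0;
    }
    t->seen++;
    if (t->seen > t->max_per_window) {
        t->counted++;            /* rate exceeded: keep only a count */
        return RECORD_COUNT_ONLY;
    }
    return RECORD_FULL;
}
```

With a limit of two events per one-second window, the third event inside the window is reduced to a count and full tracing resumes when the next window starts.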
I/O Extension to TraceLibrary
• I/O instrumentation requires changes to
application source code
• I/O trace initialization and termination routines
must be called before and after calling any other
I/O trace routines
• I/O trace bracketing routines provided for I/O
requests that are not implemented as library calls
(e.g., getc macro in C and Fortran I/O statements
that are part of the language)
I/O Extension (cont.)
• I/O instrumentation options for C programs
– Manually replace standard I/O calls with
tracing counterparts
– Define IOTRACE so that pre-processor
replaces standard I/O calls with tracing
counterparts
• I/O instrumentation of Fortran programs
– Manually bracket each I/O call with I/O trace
library bracketing routines
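The preprocessor option works by redirecting standard I/O names to tracing counterparts at compile time. The sketch below shows the general technique with a single call; sketch_trace_fopen() and open_calls are hypothetical stand-ins, and the contents of the real IOTrace.h are not reproduced. Note that the C standard formally reserves library names such as fopen, although common compilers accept this kind of macro.

```c
#include <assert.h>
#include <stdio.h>

long open_calls;   /* hypothetical counter standing in for a trace record */

/* Tracing counterpart: note the call, then forward to the real fopen. */
FILE *sketch_trace_fopen(const char *path, const char *mode)
{
    assert(path != NULL && mode != NULL);
    open_calls++;                 /* a real tracer would log an event here */
    return fopen(path, mode);     /* still the real fopen at this point */
}

#define IOTRACE
#ifdef IOTRACE
/* From here on, every fopen(...) in the source expands to the wrapper. */
#define fopen(path, mode) sketch_trace_fopen((path), (mode))
#endif
```

The wrapper is defined before the macro so that its own call still reaches the real library routine, while application code that follows is rerouted without being edited.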
I/O Extension (cont.)
• Programs containing calls to I/O extension
interface routines must be linked with
– Pablo Trace Extension Library
libPabloTraceExt.a
– Pablo Base Trace Library libPabloTrace.a
Sample C program - No Instrumentation
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *fp;
    char buffer[1024];
    size_t cnt;

    fp = fopen("/etc/motd", "r");
    if (fp != NULL) {
        cnt = fread(buffer, sizeof(char), 1024, fp);
        fclose(fp);
    }
    return 0;
}
Sample C program - Manual Instrumentation
#include "IOTrace.h"
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *fp;
    char buffer[1024];
    size_t cnt;

    initIOTrace();    /* Initialize I/O Extension */
    fp = traceFOPEN("/etc/motd", "r");
    if (fp != NULL) {
        cnt = traceFREAD(buffer, sizeof(char), 1024, fp);
        traceFCLOSE(fp);
    }
    /* Trace termination routines */
    endIOTrace();
    endTracing();
    return 0;
}
Sample C program - Preprocessor Replacement
#define IOTRACE
#include "IOTrace.h"
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *fp;
    char buffer[1024];
    size_t cnt;

    initIOTrace();    /* Initialize I/O Extension */
    fp = fopen("/etc/motd", "r");
    if (fp != NULL) {
        cnt = fread(buffer, sizeof(char), 1024, fp);
        fclose(fp);
    }
    /* Trace termination routines */
    endIOTrace();
    endTracing();
    return 0;
}
Sample Fortran program - No Instrumentation
      integer i
      open(unit=2,file='/tmp/f',form='formatted',status='new')
      i=0
      write(2, 100) i
      close(2)
 100  format('Node ', i3)
      end
Sample Fortran program - Manual Instrumentation
#include "fIOTrace.h"
      integer i
      call initIOTrace()
      call traceOpenBegin('/tmp/f', i)
      open(unit=2,file='/tmp/f',form='formatted',status='new')
      call traceOpenEnd(2)
      i=0
      call traceWriteBegin(2,1,0)
      write(2, 100) i
      call traceWriteEnd(9)
      call traceCloseBegin(2)
      close(2)
      call traceCloseEnd()
 100  format('Node ',i3)
      call endIOTrace()
      call endTracing()
      end
MPI TraceLibrary Extension
• MPI profiling library that can be linked in
without making source code changes
• Each MPI process outputs a trace file labeled
with the process number
• Insert call to SetTraceFileName()
immediately after MPI_Init() to control
location of trace file
MPI Extension (cont.)
• Disable tracing by calling MPI_Pcontrol(0)
• Re-enable tracing by calling
MPI_Pcontrol(1)
• Link with Pablo Trace Extension Library
(libPabloTraceExt.a) and Pablo Base Trace
Library (libPabloTrace.a)
• Merge per-process trace files using the
SDDF utility MergePabloTraces
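Merging per-process traces amounts to interleaving several time-ordered event streams into one. The sketch below shows that idea on in-memory arrays; it assumes each input is already sorted by time and is not the algorithm or file handling of MergePabloTraces. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TRACES 16   /* upper bound on input traces in this sketch */

struct event { double time; int node; };

/* Merge k time-sorted traces into out; returns the number of events written. */
size_t merge_traces(const struct event *const traces[], const size_t lens[],
                    size_t k, struct event *out, size_t out_cap)
{
    size_t pos[MAX_TRACES] = {0};   /* read position in each trace */
    size_t n = 0;

    assert(k <= MAX_TRACES);
    for (;;) {
        size_t best = k;            /* k means "no trace has events left" */
        for (size_t i = 0; i < k; i++) {
            if (pos[i] < lens[i] &&
                (best == k ||
                 traces[i][pos[i]].time < traces[best][pos[best]].time))
                best = i;
        }
        if (best == k || n == out_cap)
            break;
        out[n++] = traces[best][pos[best]++];   /* take the earliest event */
    }
    return n;
}
```

Ties go to the lower-numbered trace, so events with equal timestamps keep a stable order.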
Pablo Trace File Analysis
• Command-line FileStats program scans SDDF file
and reports record types, min and max values for
each field, and count of each record type.
• SDDFStatistics GUI for generating and browsing
statistics from an SDDF file
• Pablo I/O analysis command-line routines
• Pablo Analysis GUI
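The per-field summary that FileStats reports (count, minimum, maximum) can be accumulated in a single pass over the records. A minimal sketch follows, using a hypothetical struct and function; it does not read SDDF files and is not the FileStats source.

```c
#include <assert.h>
#include <stddef.h>

/* Running summary for one field of one record type. */
struct field_stats { long count; double min; double max; };

/* Fold one observed value into the summary. */
void stats_add(struct field_stats *s, double v)
{
    assert(s != NULL);
    if (s->count == 0 || v < s->min)
        s->min = v;
    if (s->count == 0 || v > s->max)
        s->max = v;
    s->count++;
}
```

One such summary per field, updated as each record is scanned, is enough to print the range of values without holding the file in memory.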
SDDFStatistics
• Statistics for entire file are displayed along top of
display
• Record types are displayed in panel at lower left
• Clicking on a record type brings up statistics for
each field of that record type
• Clicking on a field displays a histogram
summarizing values for that field
• Clicking on an array field type brings up statistics
for each dimension of that field
SDDFStatistics display
SDDFStatistics Usage
• SDDFStatistics [-toolkitoption …] [-loadSummary
filename] [-openSDDF filename]
• Or use runSDDFStatistics script which invokes the
SDDFStatistics program after setting environment
variables so that required resources can be located
I/O Analysis Programs
• IOstats generates a report of application I/O
activity summarized by I/O request type.
• IOstatsTable produces table summarizing
information about I/O operations.
• IOtotalsByPE produces a report showing
the total count, duration, and bytes involved
for various operations by processor.
I/O Analysis Programs (cont.)
• LifetimeIOstats produces a report summarizing
I/O activity by processor and file, prints a
histogram of the file lifetimes, and prints total
time spent in I/O calls for each procedure.
• FileRegionIOstats generates a report of
application I/O activity summarized by file region.
Each file is divided spatially into regions whose
size is set by calling
enableFileRegionSummaries().
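Summarizing by file region (or, analogously, by time window) comes down to mapping each request to a fixed-size bin and accumulating totals per bin. The sketch below illustrates this for file regions; the struct, the fixed bin count, and region_add() are hypothetical, and each request is attributed entirely to the region containing its starting offset, which is a simplification.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REGIONS 8   /* number of regions tracked in this sketch */

struct region_summary {
    long region_size;          /* bytes per region */
    long ops[MAX_REGIONS];     /* requests starting in each region */
    long bytes[MAX_REGIONS];   /* bytes attributed to each region */
};

/* Account for one request; returns the region index, or -1 if out of range. */
int region_add(struct region_summary *s, long offset, long nbytes)
{
    long idx;

    assert(s != NULL && s->region_size > 0);
    if (offset < 0)
        return -1;
    idx = offset / s->region_size;
    if (idx >= MAX_REGIONS)
        return -1;
    s->ops[idx]++;
    s->bytes[idx] += nbytes;
    return (int)idx;
}
```

Replacing the byte offset with a timestamp and the region size with a window length gives the same binning for time windows.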
I/O Analysis Programs (cont.)
• TimeWindowIOstats produces a report from
Time Window Summary trace records. The
execution time of the program is divided into time
windows whose size is set by calling
enableTimeWindowSummaries().
• SyncIOfileIDs processes a trace file containing I/O
trace events where many different file IDs may be
associated with a given file, and writes a new file
where every I/O trace event associated with a
particular file (as determined by the file name) has
the same file ID.
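The core of this kind of ID synchronization is a lookup from file name to a single canonical ID. A minimal sketch follows, with a hypothetical fixed-size table and function name; it shows only the mapping step, not how SyncIOfileIDs reads or rewrites trace records.

```c
#include <assert.h>
#include <string.h>

#define MAX_FILES 32   /* table capacity in this sketch */

struct name_table {
    const char *names[MAX_FILES];  /* caller keeps these strings alive */
    int used;                      /* number of names stored */
};

/* Return the canonical ID for name, adding it if new; -1 if the table is full. */
int canonical_id(struct name_table *t, const char *name)
{
    assert(t != NULL && name != NULL);
    for (int i = 0; i < t->used; i++) {
        if (strcmp(t->names[i], name) == 0)
            return i;              /* name seen before: reuse its ID */
    }
    if (t->used == MAX_FILES)
        return -1;
    t->names[t->used] = name;
    return t->used++;
}
```

Each trace event would then have its file ID replaced by canonical_id() of the file name it refers to, so repeated opens of the same file share one ID.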
I/O Characterization Research
using Pablo
• Detailed characterization of I/O behavior of
scalable applications and existing parallel file
systems
• Goals
– Enable application developers to achieve higher
fraction of peak I/O performance on existing
parallel file systems
– Help system software developers design better
parallel file systems
I/O Research (cont.)
• Target Platforms
– Intel Paragon
– IBM SP
– Convex Exemplar
– SGI Origin 2000
I/O Research (cont.)
• The Scalable I/O (SIO) Initiative has targeted a
number of application codes for study, including:
– PRISM incompressible Navier-Stokes
calculations
– SAR Synthetic Aperture Radar application
– HF Hartree-Fock calculations
– ESCAT SMC electron scattering
– RENDER ray-identification rendering
Pablo and Virtual Reality
• Problem
– Very large volume of captured performance
data for parallel systems
– Human-computer interface is bandwidth-limited
• Proposed solution
– Immerse users in virtual world so that users can
explore, viscerally experience, and modify the
dynamic behavior of application and system
software on a massively parallel system
Avatar
• Pablo virtual reality system
• Operates with workstation monitor, head-mounted
display, and the CAVE
• Presentation metaphors
– Scattercube Matrix
• generalization of 2-d scatterplot matrix
• shows 3-d projections of sparsely populated, N-dimensional space
– Time Tunnel
• event level display of processor and inter-processor
behavior
Pablo Analysis GUI
• Toolkit of data transformation modules capable of
processing SDDF records
• Supports graphical connection of performance
data transformation modules in style of AVS
• By graphically connecting modules and
interactively selecting trace data records, user
specifies desired data transformation and
presentations
• Expert users can develop and add new data
analysis modules
Analysis GUI (cont.)
• Module types
– Data analysis
• Mathematical transforms (counts, sums, ratios, max,
min, average, trig functions, etc.)
• Synthesis of vectors and arrays from scalar input
data
– Data presentation - bar graphs, bubble charts,
strip charts, contour plots, interval plots, kiviat
diagrams, 2-d and 3-d scatter plots, matrix
displays, pie charts, polar plots
Pablo Analysis GUI Main Window
Module Creation Window
Module Connection
Configuring a Module (BarGraph)
Graph Execution
Graph with Synthesize Vector Module