Path Profiling for Y2K

The Use of Program Profiling for Software Maintenance
with Applications to the Year 2000 Problem
Thomas Reps, Thomas Ball,
Manuvir Das, and James Larus
Presented by Amy Sliva
The Y2K Problem


Many computer programs used two-digit date representations
The year 2000 could be interpreted as 1900
 Madness and mayhem would reign
Other date-related problems as well
 Leap year issues
DARPA asked Reps to help plan a project to reduce the impact
 What technology could help in addition to present commercial products?
Use Path Profiling to Help

Determine the sites at which date-manipulation code occurs
 Dates are hidden in programs
 Crucial to the creation of effective tools for correcting Y2K problems
Determine whether COTS components or tools have date problems
Testing of post-renovation code
Can distinguish more behavioral differences than node- or edge-profiling
Path Profiling

Instrument program to count the number of times different paths are executed
 Paths of interest are loop-free, intraprocedural
Distribution of paths from an execution is called a path profile or path spectrum
Differences between spectra from different runs can identify date-dependent computations
 Are different paths executed using pre-2000 and post-2000 date input?
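A minimal sketch of collecting a path spectrum, assuming a hand-instrumented, hypothetical leap-year routine (is_leap and its branch labels are illustrative and not taken from the talk; a real profiler would instrument the executable automatically). Each run records the branch edges taken, and the spectrum counts how often each loop-free path occurs:

```python
from collections import Counter

def is_leap(year, trace):
    """Hypothetical date routine; `trace` records the branch edges taken."""
    if year % 4 != 0:
        trace.append("not_div4")
        return False
    trace.append("div4")
    if year % 100 != 0:
        trace.append("not_div100")
        return True
    trace.append("div100")
    if year % 400 != 0:
        trace.append("not_div400")
        return False
    trace.append("div400")
    return True

def path_spectrum(years):
    """Count how often each loop-free path through is_leap is executed."""
    spectrum = Counter()
    for year in years:
        trace = []
        is_leap(year, trace)
        spectrum[tuple(trace)] += 1
    return spectrum

old_spectrum = path_spectrum(range(1990, 2000))  # pre-2000 inputs
new_spectrum = path_spectrum(range(2000, 2010))  # post-2000 inputs

# Paths that only the post-2000 inputs exercise
new_only = set(new_spectrum) - set(old_spectrum)
print(new_only)
```

With these inputs, only the year 2000 reaches the divisible-by-400 branch, so that path appears in new_spectrum alone.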
Example 1
Differences between spectra


Path-spectrum comparison reveals paths in new_spectrum not found in old_spectrum, and vice versa
For each path in new_spectrum but not in old_spectrum, determine the shortest prefix that distinguishes it from every path in old_spectrum
 Portion of a path representing a different computation
Gather information on paths executed different numbers of times in the two spectra
 A threshold ratio (e.g., 100 to 1) identifies interesting paths
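The comparison described above might be sketched as follows, assuming each spectrum is a dict from a path identifier to its execution count. The function name compare_spectra, the sample counts, and the exact form of the ratio test are assumptions made for illustration:

```python
def compare_spectra(old, new, ratio=100.0):
    """Return (new_only, old_only, skewed) path ids for two spectra.

    `old` and `new` map a path id to a positive execution count.
    A shared path is "skewed" when one count is at least `ratio`
    times the other.
    """
    new_only = sorted(p for p in new if p not in old)
    old_only = sorted(p for p in old if p not in new)
    skewed = []
    for p in sorted(old.keys() & new.keys()):
        hi, lo = max(old[p], new[p]), min(old[p], new[p])
        if hi >= ratio * lo:
            skewed.append(p)
    return new_only, old_only, skewed

# Hypothetical counts from a pre-2000 run and a post-2000 run
old_spectrum = {0: 500, 3: 2, 5: 40}
new_spectrum = {0: 480, 3: 900, 7: 1}

new_only, old_only, skewed = compare_spectra(old_spectrum, new_spectrum)
print(new_only, old_only, skewed)
```

Path 0 runs about equally often in both spectra and is ignored, while path 3 jumps from 2 to 900 executions and is flagged.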
Finding the shortest path prefix


Build a trie from the paths in old_spectrum to find the shortest prefix
Walking a new path through the trie, the first edge that deviates from the trie is the last edge of the shortest prefix
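A small sketch of the trie lookup, assuming each path is a tuple of edge labels and the trie is a nested dict. The edge names and both function names are hypothetical:

```python
def build_trie(paths):
    """Build a nested-dict trie keyed by the edges of each path."""
    root = {}
    for path in paths:
        node = root
        for edge in path:
            node = node.setdefault(edge, {})
    return root

def shortest_distinguishing_prefix(trie, path):
    """Return the prefix of `path` ending at the first edge not in the trie."""
    node = trie
    for i, edge in enumerate(path):
        if edge not in node:
            return path[: i + 1]
        node = node[edge]
    return None  # every edge matched: no distinguishing prefix exists

old_paths = [
    ("S->A", "A->B", "B->E"),
    ("S->A", "A->B", "B->D", "D->E"),
]
new_path = ("S->A", "A->C", "C->E")

trie = build_trie(old_paths)
prefix = shortest_distinguishing_prefix(trie, new_path)
print(prefix)
```

Here the new path leaves the trie at edge A->C, so the prefix ends there.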
Efficient Path Profiling

Ball and Larus algorithm with overheads of 30-40%
 Numbering scheme applied to acyclic control-flow graph
Ball-Larus labels the graph with two quantities:
 Each node W is labeled with the value num_paths_from(W)
 Each edge is labeled with a value derived from num_paths_from
Each path will end up with a unique number using these two quantities
Ball-Larus Algorithm


Backward dataflow analysis
Nodes labeled with num_paths_from
 Exit labeled 1
 num_paths_from(W) = num_paths_from(W1) + … + num_paths_from(Wk), where W1 … Wk are the successors of W
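As an illustration of the recurrence, here is a sketch that computes num_paths_from on a small acyclic control-flow graph. The graph SUCC is a hypothetical example, not the one from the slides, and the memoized depth-first traversal is just one way to visit successors before their predecessors:

```python
# Hypothetical acyclic control-flow graph: node -> ordered successors
SUCC = {
    "Start": ["B", "C"],
    "B": ["C", "D"],
    "C": ["D"],
    "D": ["E", "Exit"],
    "E": ["Exit"],
    "Exit": [],
}

def num_paths_from(succ):
    """Label each node with the number of paths from it to the exit."""
    counts = {}

    def visit(w):
        if w not in counts:
            kids = succ[w]
            # Exit (no successors) is labeled 1; otherwise sum the successors
            counts[w] = 1 if not kids else sum(visit(k) for k in kids)
        return counts[w]

    for w in succ:
        visit(w)
    return counts

counts = num_paths_from(SUCC)
print(counts)
```

For this graph there are six distinct paths from Start to Exit, so paths can be numbered 0 through 5.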
Ball-Larus Algorithm (cont.)

Number edges such that every path from Start to Exit has a unique sum of edge labels in the range [0…num_paths_from(Start) - 1]
 For an edge W→Wi, vi is the sum of the number of paths to Exit from all successors of W that are to the left of Wi
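The edge-numbering rule can be checked on a small example. The sketch below assumes a hypothetical graph in which the order of each successor list stands in for the left-to-right order of the successors. It assigns the edge values and then confirms, by enumerating every path, that the sums are unique:

```python
# Hypothetical acyclic control-flow graph: node -> successors, left to right
SUCC = {
    "Start": ["B", "C"],
    "B": ["C", "D"],
    "C": ["D"],
    "D": ["E", "Exit"],
    "E": ["Exit"],
    "Exit": [],
}

def num_paths_from(succ):
    counts = {}
    def visit(w):
        if w not in counts:
            kids = succ[w]
            counts[w] = 1 if not kids else sum(visit(k) for k in kids)
        return counts[w]
    for w in succ:
        visit(w)
    return counts

def edge_values(succ, counts):
    """Value of W->Wi = paths to Exit from successors of W left of Wi."""
    values = {}
    for w, kids in succ.items():
        running = 0
        for wi in kids:
            values[(w, wi)] = running
            running += counts[wi]
    return values

def all_paths(succ, node="Start"):
    """Enumerate every path from `node` to the exit as a list of nodes."""
    if not succ[node]:
        yield [node]
    for nxt in succ[node]:
        for rest in all_paths(succ, nxt):
            yield [node] + rest

counts = num_paths_from(SUCC)
values = edge_values(SUCC, counts)
path_ids = sorted(
    sum(values[edge] for edge in zip(p, p[1:])) for p in all_paths(SUCC)
)
print(path_ids)
```

Every path receives a distinct number in the range 0 to num_paths_from(Start) - 1, so a profiler only needs to add edge values along the way and bump one counter at the exit.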
Ball-Larus Algorithm (cont.)
Example 2
Finding Path Prefixes from Ball-Larus labels



p is a path in new_spectrum that does not occur in old_spectrum
Index structure supporting range queries is built on old_spectrum
Sequence of queries is issued to determine if ranges are empty
 IsRangeEmpty(S,a,b) = true if S does not contain values in the range [a…b]
Finding Path Prefixes from Ball-Larus labels (cont.)




Start search where W = Start, pre = empty, and c = 0
Searched Start to W and have not found distinguishing edge
 Paths with prefix pre have numbers in the range [c…c + num_paths_from(W) - 1]
Query IsRangeEmpty(old_spectrum, c + vi, c + vi + num_paths_from(Wi) - 1) for the edge W→Wi that p follows from W
 If true, prefix is pre||(W→Wi) and prefix value = c + vi
 If false, continue with W = Wi, pre = pre||(W→Wi), and c = c + vi
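A sketch of this search, assuming the index on old_spectrum is a sorted list queried with binary search. The node and edge labels are hard-coded for a hypothetical six-path graph, and shortest_prefix is an illustrative name:

```python
from bisect import bisect_left

# Hypothetical Ball-Larus labels for a small graph with six paths
NUM_PATHS_FROM = {"Start": 6, "B": 4, "C": 2, "D": 2, "E": 1, "Exit": 1}
EDGE_VALUE = {
    ("Start", "B"): 0, ("Start", "C"): 4,
    ("B", "C"): 0, ("B", "D"): 2,
    ("C", "D"): 0,
    ("D", "E"): 0, ("D", "Exit"): 1,
    ("E", "Exit"): 0,
}

def is_range_empty(sorted_ids, a, b):
    """True if no value in sorted_ids lies in the range [a, b]."""
    i = bisect_left(sorted_ids, a)
    return i == len(sorted_ids) or sorted_ids[i] > b

def shortest_prefix(old_ids, new_path):
    """Return (prefix edges, prefix value) for a path absent from old_ids.

    `new_path` is the list of nodes from Start to Exit.
    """
    sorted_ids = sorted(old_ids)
    pre, c = [], 0
    for w, wi in zip(new_path, new_path[1:]):
        v = EDGE_VALUE[(w, wi)]
        pre.append((w, wi))
        # Old paths sharing this prefix would be numbered in this range
        if is_range_empty(sorted_ids, c + v, c + v + NUM_PATHS_FROM[wi] - 1):
            return pre, c + v
        c += v
    return None  # the path itself occurs in old_ids

old_ids = {0, 1, 5}  # path numbers seen in the old run
result = shortest_prefix(old_ids, ["Start", "B", "D", "Exit"])
print(result)
```

The first query covers the range [0…3] and finds old paths 0 and 1, so the search continues. The second query covers [2…3] and is empty, which makes Start→B followed by B→D the shortest distinguishing prefix.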
Implementation and Results




Prototype system called DYNADIFF
Instruments executable files
User interfaces for displaying and organizing path spectra
Tested on Unix cal and ncftp
 Correctly identified path for handling leap years in cal
 Introduced two-digit Y2K problem into ncftp and were able to identify different pre- and post-2000 behavior
Other Applications of Path
Profiling




Testing
Systems that warn of internal errors
Regression testing
Testing for inconsistent data