Data Structures and Algorithms

Performance Measurement
CSE, POSTECH
Program Performance

Recall that program performance is the
amount of computer memory and time needed to
run a program. It is determined in two ways:
1. Analytically - performance analysis
2. Experimentally - performance measurement

The performance of a program depends on
– the number and type of operations performed, and
– the memory access pattern for the data and
instructions
Performance Analysis

Paper and pencil.

Do NOT need a working computer program or
even a computer.
Some Uses of Performance Analysis
Why do we want to do a performance analysis of
algorithms?
– To determine the practicality of an algorithm
– To predict run time on large instances
– To compare two algorithms that have different
asymptotic complexity
  - e.g., O(n) and O(n^2)
Limitations of Performance Analysis
Does NOT account for constant factors.
– But constant factors may dominate,
e.g., 1000n vs. n^2,
especially if we are interested only in n < 1000
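The 1000n vs. n^2 comparison can be checked numerically. The following is a minimal sketch, assuming two hypothetical cost functions, `costLinear` and `costQuadratic`, that stand in for the operation counts of two algorithms; the names and the sample sizes are illustrative and do not come from the text.

```cpp
#include <initializer_list>
#include <iostream>

// Hypothetical operation counts for two algorithms (illustrative only).
long long costLinear(long long n)    { return 1000 * n; }  // O(n), constant factor 1000
long long costQuadratic(long long n) { return n * n; }     // O(n^2), constant factor 1

// Returns true when the O(n^2) algorithm performs fewer operations at size n.
bool quadraticIsCheaper(long long n) {
    return costQuadratic(n) < costLinear(n);
}

// Prints both counts for a few instance sizes around the crossover point n = 1000.
void printComparison() {
    for (long long n : {10LL, 100LL, 999LL, 1000LL, 10000LL}) {
        std::cout << "n=" << n
                  << "  1000n=" << costLinear(n)
                  << "  n^2=" << costQuadratic(n) << '\n';
    }
}
```

Calling `printComparison()` lists the two counts side by side. For every n below 1000 the asymptotically worse algorithm does less work, which is the point of the slide.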
Memory Hierarchy
Modern computers have a hierarchical memory
organization with different access times for
memory at different levels of the hierarchy.

[Figure: ALU <- R <- L1 <- L2 <- MAIN]
Level    Size     Access time
R        8-32     1C
L1       32KB     2C
L2       512KB    10C
MAIN     512MB    100C
C = CPU cycle

– Read Sections 4.5.1 & 4.5.2

Limitations of Performance Analysis

Performance analysis does not account for this
difference in memory access times.

Programs that do more work may take less time
than those that do less work.
– e.g., a program with a large operation count and a
small number of accesses to slow memory may take
less time than a program with a small operation count
and a large number of accesses to slow memory
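A common way to see the effect of memory access patterns is to traverse the same matrix in two different orders. The sketch below is an illustration, not an example from the text: the function names are hypothetical, and it assumes the matrix is stored row by row in one contiguous vector. Both functions perform the same number of additions, but the column-order version jumps through memory with a stride of n elements and therefore tends to touch slow memory more often on large matrices.

```cpp
#include <cstddef>
#include <vector>

// n x n matrix stored row by row: element (r, c) lives at index r * n + c.

// Visits elements in the order they are laid out in memory (stride 1).
long long sumRowOrder(const std::vector<int>& m, std::size_t n) {
    long long sum = 0;
    for (std::size_t r = 0; r < n; r++)
        for (std::size_t c = 0; c < n; c++)
            sum += m[r * n + c];
    return sum;
}

// Visits the same elements column by column (stride n), same operation count.
long long sumColumnOrder(const std::vector<int>& m, std::size_t n) {
    long long sum = 0;
    for (std::size_t c = 0; c < n; c++)
        for (std::size_t r = 0; r < n; r++)
            sum += m[r * n + c];
    return sum;
}
```

Operation counts alone cannot tell these two functions apart. Whether a time difference shows up, and how large it is, depends on the matrix size and the machine, so it has to be measured.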
Performance Measurement


Concerned with obtaining the actual space and
time requirements of a program
Actual space and time are dependent on
– Compiler and options
– Specific computer
We do not generally consider run-time space
requirements (read the reasons on page 122)
Performance Measurement Needs (1)




programming language
working program
computer
compiler and options to use, e.g., g++ -O, -O2, -O3
(see manual pages for g++)
Performance Measurement Needs (2)

data to use for measurement
1. worst-case data
2. best-case data
3. average-case data
What are the worst-case, best-case, and average-case data
for insertionSort and how do you generate them?
timing mechanism: clock()
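One possible answer to the insertionSort question is sketched below. It assumes an ascending insertion sort over integers. The generator names, the `seed` parameter, and the shift-counting version of the sort are illustrative choices, not the textbook's code. For such a sort, already sorted input needs no shifts (best case), strictly decreasing input shifts every earlier element on each insert (worst case), and a random permutation serves as one sample for estimating the average case.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Best case: already sorted, 0, 1, ..., n-1.
std::vector<int> bestCaseData(int n) {
    std::vector<int> a(n);
    std::iota(a.begin(), a.end(), 0);
    return a;
}

// Worst case: strictly decreasing, n, n-1, ..., 1.
std::vector<int> worstCaseData(int n) {
    std::vector<int> a(n);
    for (int i = 0; i < n; i++) a[i] = n - i;
    return a;
}

// One average-case sample: a random permutation of 0..n-1.
// The seed is an assumed parameter so that runs are repeatable.
std::vector<int> randomPermutation(int n, unsigned seed) {
    std::vector<int> a = bestCaseData(n);
    std::mt19937 gen(seed);
    std::shuffle(a.begin(), a.end(), gen);
    return a;
}

// Ascending insertion sort; returns the number of element shifts performed.
long long insertionSort(std::vector<int>& a) {
    long long shifts = 0;
    for (std::size_t i = 1; i < a.size(); i++) {
        int t = a[i];
        std::size_t j = i;
        while (j > 0 && a[j - 1] > t) {  // shift larger elements one slot right
            a[j] = a[j - 1];
            j--;
            shifts++;
        }
        a[j] = t;
    }
    return shifts;
}
```

The shift count makes the three cases easy to verify without a clock: the best case performs 0 shifts and the worst case performs n(n-1)/2. An average-case time estimate would average the measured times over many random permutations.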
Choosing Instance Size

We decide on which values of instance size (n)
to use according to two factors:
1. the amount of timing we want to perform
2. what we expect to do with the times
In practice, we generally need the times for more
than three values of n (read the reasons on page
123)
Timing in C++
#include <ctime>  // clock(), clock_t, CLOCKS_PER_SEC

// clock ticks per millisecond
double clocksPerMillis = double(CLOCKS_PER_SEC) / 1000;
clock_t startTime = clock();
// code to be timed comes here
// elapsed time in milliseconds
double elapsedMillis = (clock() - startTime) / clocksPerMillis;
Shortcoming

See Program 4.1 and its execution times in Figure
4.1 (what is wrong with these execution times?)
– the time needed for the worst-case sorts is too small for
clock() to measure

Clock accuracy
– assume the clock is accurate to within 100 ticks
– if the method returns a time of t, the actual time lies
between max{0, t-100} and t+100
– for Figure 4.1, the actual time could be anywhere between 0 and 100 ticks
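The bound max{0, t-100} to t+100 can be written as a small helper. This is a minimal sketch; the function name `actualTimeBounds` and the `accuracy` parameter are hypothetical, introduced only to make the rule concrete.

```cpp
#include <algorithm>
#include <utility>

// Returns {lowest, highest} possible actual time, in ticks, for a reported
// time t when the clock is accurate to within `accuracy` ticks.
std::pair<long, long> actualTimeBounds(long t, long accuracy) {
    return {std::max(0L, t - accuracy), t + accuracy};
}
```

With an accuracy of 100 ticks, a reported time of 0 gives the range 0 to 100, so the measurement carries almost no information. A reported time of 1000 gives 900 to 1100.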
Shortcoming





Repeat work many times to bring total time to
be >= 1000 ticks
See Program 4.2
What is the difference between Prog 4.1 & 4.2?
See Figures 4.2 & 4.3
See Figure 4.4 (overhead measurement)
Accurate Timing
clock_t startTime = clock();
long numberOfRepetitions = 0;  // must be initialized before counting
do {
    numberOfRepetitions++;
    doSomething();
} while (clock() - startTime < 1000);  // repeat until at least 1000 ticks have elapsed
double elapsedMillis = (clock() - startTime) / clocksPerMillis;
double timeForCode = elapsedMillis / numberOfRepetitions;  // milliseconds per call
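The repetition loop can be packaged as a reusable function. The following is a minimal sketch under stated assumptions: `timeRepeated`, `TimingResult`, and the stand-in workload `doSomething` are hypothetical names chosen for illustration, and the workload simply adds up some integers so that the example is self-contained.

```cpp
#include <ctime>

// Stand-in workload (illustrative only); the volatile sink keeps the
// compiler from optimizing the loop away.
volatile long long sink = 0;
void doSomething() {
    long long s = 0;
    for (int i = 0; i < 1000; i++) s += i;
    sink = s;
}

struct TimingResult {
    long repetitions;      // how many times the work was run
    double millisPerCall;  // average elapsed milliseconds per run
};

// Runs work() repeatedly until at least minTicks clock ticks have elapsed,
// reading the start time only once, before the loop.
TimingResult timeRepeated(void (*work)(), clock_t minTicks) {
    const double clocksPerMillis = double(CLOCKS_PER_SEC) / 1000;
    long numberOfRepetitions = 0;
    clock_t startTime = clock();
    do {
        numberOfRepetitions++;
        work();
    } while (clock() - startTime < minTicks);
    double elapsedMillis = (clock() - startTime) / clocksPerMillis;
    return {numberOfRepetitions, elapsedMillis / numberOfRepetitions};
}
```

A call such as `timeRepeated(doSomething, 1000)` returns the repetition count and the average time per call. The result still includes the cost of the loop and of the `clock()` calls, so that overhead may need to be measured and subtracted separately.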
Accuracy

With a total time of at least 1000 ticks and a clock accurate to within 100 ticks, accuracy is now 10%.

First reading may be just about to change to startTime + 100

Second reading (final value of clock()) may have just
changed to finishTime

so finishTime - startTime is off by 100 ticks
Accuracy

First reading may have just changed to startTime

Second reading may be about to change to finishTime
+ 100

so finishTime - startTime is off by 100 ticks
Accuracy

Examining remaining cases, we get
trueElapsedTime = finishTime - startTime ± 100 ticks

To ensure 10% accuracy, require
elapsedTime = finishTime - startTime >= 1000 ticks
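The 10% figure follows directly from the two bounds above. As a short worked step, writing E for the measured elapsedTime in ticks (the symbol E is introduced here only for brevity):

```latex
\[
\frac{\lvert \text{trueElapsedTime} - E \rvert}{E}
\;\le\; \frac{100}{E}
\;\le\; \frac{100}{1000}
\;=\; 10\%
\qquad \text{whenever } E = \text{finishTime} - \text{startTime} \ge 1000 .
\]
```

Requiring a larger minimum, for example 10000 ticks, would tighten the same bound to 1% at the cost of a longer measurement.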
What is wrong with the following measurement?
long numberOfRepetitions = 0; // Program 4.3
clock_t elapsedTime = 0;
do
{
numberOfRepetitions++;
clock_t startTime = clock( );
doSomething();
elapsedTime += clock( ) - startTime;
} while (elapsedTime < 1000); // repeat until enough time has elapsed
Answer to Ch. 4, Exercise 1
In each iteration of the do-while loop, the amount added to
elapsedTime may deviate from the actual run time of
doSomething by up to 100 ms (or 100 ticks). This error is
additive over the iterations and so does not decline as a
fraction of total time.
For example, suppose that doSomething takes almost 100
ms to execute. In the worst case, the clock reading will
change just before each execution of the assignment
startTime = clock() and the amount added to elapsedTime
is zero on each iteration of the do-while loop; the do-while
loop does not terminate.

How do we fix this?
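One plausible fix, offered here as an assumption rather than as the textbook's official answer, is to read the clock once before the loop, so that the 100-tick error is incurred once instead of once per iteration. The sketch below demonstrates the difference with a simulated clock whose reading changes only every 100 ticks. `CoarseClock`, the tick costs, and the iteration cap are all hypothetical and exist only to make the worst case reproducible.

```cpp
// Simulated clock (hypothetical): true time advances tick by tick, but the
// visible reading only changes every 100 ticks.
struct CoarseClock {
    long trueTicks = 0;
    long read() const { return (trueTicks / 100) * 100; }
    void advance(long ticks) { trueTicks += ticks; }
};

// Program 4.3 style: read the clock around every call and add up the
// differences. Returns the accumulated reading after at most maxIterations.
long perIterationElapsed(long workTicks, long overheadTicks, long maxIterations) {
    CoarseClock c;
    long elapsed = 0;
    for (long i = 0; i < maxIterations && elapsed < 1000; i++) {
        long start = c.read();
        c.advance(workTicks);       // stands in for doSomething()
        elapsed += c.read() - start;
        c.advance(overheadTicks);   // loop bookkeeping between readings
    }
    return elapsed;
}

// Single-start style: read the clock once before the loop. Returns the
// number of repetitions needed for the reading to advance by 1000 ticks.
long singleStartRepetitions(long workTicks, long overheadTicks) {
    CoarseClock c;
    long start = c.read();
    long repetitions = 0;
    do {
        repetitions++;
        c.advance(workTicks);
        c.advance(overheadTicks);
    } while (c.read() - start < 1000);
    return repetitions;
}
```

With 99 ticks of work and 1 tick of overhead, every start reading lands just after a clock change, so `perIterationElapsed` stays at zero no matter how many iterations run. `singleStartRepetitions` terminates after 10 repetitions with the same inputs.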
Time Measurement in Time Shared Systems

UNIX
– time MyProgram
– See man pages for time

Do Exercise 4.2

Read Chapter 4