performance - Martine Ceberio

Introduction to main concepts of algorithms:
Analysis: how “long” does my algorithm take to run?
Two main dimensions in performance analysis are time and space. We will mostly focus on time,
although we will make some space-related comments when appropriate.
How do we analyze an algorithm? We need access to a model of its implementation. For instance, with a
black-box or black-box-based algorithm, it is hard to get a sense of the time complexity: we might get a
reasonable idea, but it will not take into account the best- and worst-case scenarios.
Let’s assume we have access to the following algorithm.
Initialization of j
For j from 2 to n
    Key = A[j]
    i = j-1
    While i > 0 and A[i] > Key
        A[i+1] = A[i]
        i = i-1
    A[i+1] = Key
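As a rough sketch, the pseudocode above could be written in Python as follows. Python lists are 0-indexed, so the outer loop starts at index 1 instead of 2 and the while test becomes i >= 0. The function name insertion_sort is ours and does not appear in the notes.

```python
def insertion_sort(a):
    """Sort the list a in place (ascending order) and return it."""
    for j in range(1, len(a)):          # pseudocode: for j from 2 to n
        key = a[j]
        i = j - 1
        # Shift every element larger than key one slot to the right.
        while i >= 0 and a[i] > key:    # pseudocode: i > 0 (1-indexed)
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return a


print(insertion_sort([5, 2, 4, 6, 1, 3]))   # prints [1, 2, 3, 4, 5, 6]
```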
Steps and their costs:
Initialization of j: 1
At each iteration of the loop:
- Check if j <= n: 1
- Assign A[j] to Key: 1
- Assign j-1 to i: 1
- For each while iteration:
    o Check i > 0: 1
    o Check A[i] > Key: 1
    o Assign A[i] to A[i+1]: 1
    o Decrement i: 1
  x max number of while iterations = initial value of i = j-1
  + last check of both conditions = 2
- Assign Key to A[i+1]: 1
- Increment j: 1
= [3 + {4.(j-1)+2} + 2] at each iteration
+ last check of condition j <= n that fails = 1
TOTAL = 1 + sum over j [2 to n] [3 + {4.(j-1)+2} + 2] + 1
= 2 + sum over j [2 to n] [3 + {4.(j-1)+2} + 2]
= 2 + sum over j [2 to n] [7 + 4.(j-1)]
= 2 + sum over j [2 to n] [7] + 4.sum over j [1 to n-1] [j]
= 2 + 7.(n-1) + 4.n.(n-1)/2
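One way to sanity-check this total is to instrument the algorithm and count steps under the same unit-cost convention. The sketch below is ours (the names count_steps and worst_case_formula are hypothetical). It charges 2 for the last check of the while conditions, exactly as the count above does, and uses a strictly decreasing input so that the while loop runs j-1 times for each j.

```python
def count_steps(a):
    """Run insertion sort on a copy of a; return the step count under the unit-cost model."""
    a = list(a)
    n = len(a)
    steps = 1                      # initialization of j
    for j in range(1, n):
        steps += 3                 # check j <= n, Key = A[j], i = j-1
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            steps += 4             # two checks, one shift, one decrement
            a[i + 1] = a[i]
            i -= 1
        steps += 2                 # last check of both conditions (charged as 2, as in the count above)
        a[i + 1] = key
        steps += 2                 # A[i+1] = Key, increment j
    steps += 1                     # final check j <= n that fails
    return steps


def worst_case_formula(n):
    return 2 + 7 * (n - 1) + 4 * n * (n - 1) // 2


for n in (1, 2, 5, 10):
    reversed_input = list(range(n, 0, -1))   # worst case: strictly decreasing
    print(n, count_steps(reversed_input), worst_case_formula(n))
```

For each n >= 1 the two printed numbers should agree. On any other input the count can only be smaller, since the while loop then runs fewer than j-1 times for some j.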
Warning! We assume that simple steps cost 1 (to make the reading easier). To be more precise, we
should assign each of them a specific constant: c1, c2, c3, etc.
The above time complexity is considered quadratic because of its term 4.n.(n-1)/2. In earlier
classes, you wrote the complexity of this algorithm as big-Oh of n^2, i.e., stating that the
behavior of n^2 constitutes an upper bound on the behavior of the algorithm. We will see more
notations, which allow us to state more detailed and different information. For now, let’s stick with big-Oh.
So big-Oh is a common way to characterize the behavior of an algorithm: an upper bound, which we try to make
as tight as possible, since the lower the bound the better.
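To make "upper bound" concrete: a common formal reading of "T(n) is big-Oh of n^2" is that there exist constants c > 0 and n0 such that T(n) <= c.n^2 for all n >= n0. The sketch below checks this numerically for the total 2 + 7.(n-1) + 4.n.(n-1)/2 obtained above. The constants c = 3 and n0 = 4 are our choice for illustration, and a finite check like this is a sanity test, not a proof.

```python
def T(n):
    """Worst-case step count from the analysis: 2 + 7(n-1) + 4n(n-1)/2."""
    return 2 + 7 * (n - 1) + 4 * n * (n - 1) // 2


C, N0 = 3, 4   # illustrative constants (our choice, not from the notes)

# T(n) = 2n^2 + 5n - 5, so T(n) <= 3n^2 whenever n^2 - 5n + 5 >= 0, i.e. for n >= 4.
assert all(T(n) <= C * n * n for n in range(N0, 10_000))

print(T(3), C * 3 * 3)   # prints 28 27: below n0 the bound may fail
```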
Now, other concerns have to be taken into account:
- Big-Oh behavior versus size of the input
- Best-case behavior
- Worst-case behavior
- Average-case behavior
1. Big-Oh behavior versus size of the input.
Let’s take a look at two algorithms’ behaviors.
Algorithm 1 is big-Oh of n^2 + 100000.
An algorithm is not necessarily bad just because it is n^p with a large p. Sometimes the
constants, which the notation ignores, make a real difference in the actual performance for reasonably
small input sizes.
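A small numeric illustration of this point. Only Algorithm 1's cost (n^2 + 100000) comes from the notes; the cubic cost n^3 used for comparison is a hypothetical stand-in of ours, chosen only to show that the higher-degree algorithm can be the cheaper one on small inputs.

```python
def cost_algorithm_1(n):
    return n**2 + 100_000      # from the notes


def cost_hypothetical_cubic(n):
    return n**3                # hypothetical second algorithm (our assumption)


# Smallest n at which the cubic cost exceeds the cost of Algorithm 1.
crossover = next(n for n in range(1, 10_000)
                 if cost_hypothetical_cubic(n) > cost_algorithm_1(n))
print(crossover)   # prints 47
```

Under these assumed costs, the cubic algorithm is the cheaper one for every input size below 47, despite its worse big-Oh behavior.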
2. Worst-case behavior.
We need actual proven data that tells us how slowly the program can run in the worst case, and not just
assumptions or hypotheses that it will hopefully not take more than X seconds, Y MB, etc.
In addition, in many cases the average behavior (the most expected one) is very close to the worst case.
In many applications, the worst case also happens often: e.g., looking for an item that does not exist.
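The "item that does not exist" example can be sketched with a plain linear search that also reports how many comparisons it made. The function name and the test data are ours, for illustration only.

```python
def linear_search(items, target):
    """Return (index, comparisons); index is -1 when target is absent."""
    comparisons = 0
    for index, item in enumerate(items):
        comparisons += 1
        if item == target:
            return index, comparisons
    return -1, comparisons


data = list(range(1000))
print(linear_search(data, 0))     # prints (0, 1): found immediately
print(linear_search(data, -5))    # prints (-1, 1000): absent, every element is examined
```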
3. Average-case behavior.
This gives a rough idea of what to expect when we don’t know anything specific about the input.
The average time complexity is defined as:
the sum of the running times over all possible inputs / the number of possible inputs
(this treats every possible input as equally likely).
It can be hard to compute.
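For very small inputs, this definition can be applied by brute force. The sketch below is ours: it takes "all possible inputs" to be the permutations of n distinct values, assumes each is equally likely, and uses the number of while-loop iterations of insertion sort as a stand-in for running time.

```python
from itertools import permutations


def while_iterations(a):
    """Number of while-loop iterations insertion sort performs on a."""
    a = list(a)
    count = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
            count += 1
        a[i + 1] = key
    return count


n = 4
runs = [while_iterations(p) for p in permutations(range(n))]
average = sum(runs) / len(runs)
print(len(runs), average, max(runs))   # prints 24 3.0 6
```

Here the average (3) is half the worst case (6). Enumerating all n! inputs quickly becomes infeasible as n grows, which is one reason the average case is often hard to obtain.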
Note: for many NP-hard search problems, a traditional analysis will conclude that the algorithm
runs in exponential time. That is the average / expected time. However, such algorithms include heuristics
(for now: ways to speed up the running time) that do not appear in the analysis but really impact the actual
running time. In such cases, running experiments is the only way to show the performance of the
algorithm.
4. Best-case behavior.
It is not critical to know the best case: it might not happen often, and it is not what users are interested in.
Still, it can be worth taking a look at it because it can shed light on possible improvements.
E.g., if you do not achieve the best case on a very specific problem, there might be room to improve by
designing a heuristic or shortcut for the program.
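For the insertion sort analyzed earlier, the best case is easy to compute from the per-iteration cost [3 + {4.t + 2} + 2], where t is the number of while iterations for a given j. The sketch below is ours (the function total_steps is hypothetical). It plugs in t = 0 for an already sorted input and t = j-1 for the worst case, under the same unit-cost assumption as before.

```python
def total_steps(n, while_iterations):
    """Total cost under the unit-cost model of the analysis.

    while_iterations(j) is the number of while-loop iterations for outer index j (2..n).
    """
    return 2 + sum(7 + 4 * while_iterations(j) for j in range(2, n + 1))


n = 100
best = total_steps(n, lambda j: 0)         # already sorted: the while body never runs
worst = total_steps(n, lambda j: j - 1)    # strictly decreasing: j-1 iterations each time
print(best, worst)   # prints 695 20495
```

The best case is 2 + 7.(n-1), which is linear in n, against a quadratic worst case.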
EXERCISES: all four from page 29 of Introduction to Algorithms.