Algorithm Analysis

advertisement
Algorithm Analysis
CS 201 Fundamental Structures
of Computer Science
Introduction
• Once an algorithm is given for a problem
and decided to be correct, an important
step is to determine how much in the way
of resources (such as time or space) the
algorithm will require.
• In this chapter, we shall discuss
– How to estimate the time required for a
program
Mathematical background
• The idea of the following definitions is to
establish a relative order among functions.
• Given two functions, there are usually
points where one function is smaller than
the other function.
– It does not make sense to claim f(N) < g(N)
– Let us compare f(N) = 1000N and g(N) = N2
– Thus, we will compare their relative rates of
growth
Mathematical background
• Definition 1:
T ( N )  O( f ( N )) if there are positive constants c and n0
such that T ( N )  c  f ( N ) when N  n0
T ( N )  O( f ( N )) says that the growth rate of T ( N ) is
less than or equal to that of f ( N )
 T ( N ) grows at a rate no faster than f ( N )
 Thus f ( N ) is an upper bound on T ( N )
Mathematical background
• Definition 2:
T ( N )  ( g ( N )) if there are positive constants c and n0
such that T ( N )  c  g ( N ) when N  n0
T ( N )  ( g ( N )) says that the growth rate of T ( N ) is
greater than or equal to that of g ( N )
 T ( N ) grows at a rate no slower than g ( N )
 Thus g ( N ) is a lower bound on T ( N )
Mathematical background
• Definition 3:
T ( N )  (h( N )) if and only if
T ( N )  O(h( N )) and T ( N )  (h( N ))
T ( N )  (h( N )) says that the growth rate of T ( N )
equals to that of h( N )
Examples
• f(N)=N3 grows faster than g(N)=N2, so
g(N) = O(f(N)) or f(N) = (g(N))
• f(N)=N2 and g(N)=2N2 grow at the same rate, so
f(N) = O(g(N)) and f(N) = (g(N))
• If g(N)=2N2, g(N)=O(N4), g(N)=O(N3),
g(N)=O(N2) are all technically correct, but the
last one is the best answer.
• Do not say T(N)=O(2N2) or T(N)=O(N2+N). The
correct form is T(N)=O(N2).
Mathematical background
• Rule 1: If T1 ( N )  O( f ( N )) and T2 ( N )  O( g ( N )), then
(a) T1 ( N )  T2 ( N )  O( f ( N )  g ( N ))
(b) T1 ( N )  T2 ( N )  O( f ( N )  g ( N ))
• Rule 2: If T ( N ) is a polynomial of degree k , then
T ( N )  ( N k )
• Rule 3: log k N  O( N ) for any constant k .
This tells us that logarithms grow very slowly.
Examples
• Determine which of f(N)=N logN and g(N)=N1.5
grows faster.
 Determine which of logN and N0.5 grows faster.
 Determine which of log2N and N grows faster.
 Since N grows faster than any power of a log,
g(N) grows faster than f(N).
Examples
• Consider the problem of downloading a file over
the Internet.
– Setting up the connection: 3 seconds
– Download speed: 1.5 Kbytes/second
• If a file is N kilobytes, the time to download is
T(N) = N/1.5 + 3, i.e., T(N) = O(N).
– 1500K file takes 1003 seconds
– 750K file takes 503 seconds
• If the connection speed doubles, both times
decrease, but downloading 1500K still takes
approximately twice the time downloading 750K.
Typical growth rates
Function
c
log N
log2 N
N
N log N
N2
N3
2N
Name
Constant
Logarithmic
Log-squared
Linear
Quadratic
Cubic
Exponential
Plots of various algorithms
Algorithm analysis
• The most important resource to analyze is
generally the running time
– We assume that simple instructions (such as addition,
comparison, and assignment) take exactly one unit time
• Unlike the case with real computers
– For example, I/O operations take more time compared to
comparison and arithmetic operators
– Obviously, we do not have this assumption for fancy
operations such as matrix inversion, list insertion, and
sorting.
– We assume infinite memory
– We do not include the time required to read the input
Algorithm analysis
• Typically, the size of the input is the main
consideration
– Worst-case performance represents a guarantee for
performance on any possible input
– Average-case performance often reflects typical behavior
– Best-case performance is often of little interest
• Generally, it is focused on the worst-case analysis
– It provides a bound for all inputs
– Average-case bounds are much more difficult to compute
Algorithm analysis
• Although using Big-Theta would be more
precise, Big-Oh answers are typically given.
• Big-Oh answer is not affected by the
programming language
– If a program is running much more slowly than the
algorithm analysis suggests, there may be an
implementation inefficiency
Algorithm analysis
• If two programs are expected to take similar
times, probably the best way to decide which is
faster to code them both up and run them!
• On the other hand, we would like to eliminate the
bad algorithmic ideas early by algorithm analysis
– Although different instructions can take different
amounts of time (these would correspond to
constants in Big-Oh notation), we ignore this
difference in our analysis and try to find the upper
bound of the algorithm
A simple example
• Estimate the running time of the following
function
int sum(int n){
int partialSum = 0;
for (int i = 0; i <= n; i++)
partialSum += i * i * i;
return partialSum;
}
General rules
• Rule 1 – for loops: The running time of a for loop is at
most the running time of the statements inside the for
loop (including tests) times the number of iterations
• Rule 2 – nested loops: Analyze these inside out. The
total running time of a statement inside a group of nested
loops is the running time of the statement multiplied by
the product of the sizes of all the loops
for (i = 0; i < n; i++)
for (j = i; j < n; j++)
k++;
General rules
• Rule 3 – consecutive statements: just add.
for (i = 0; i < n; i++)
a[i] = 0;
for (i = 0; i < n; i++)
for (j = 0; j < n; j++)
a[i] += a[j] + i + j;
• Rule 4 – if/else: For the following, the running time is
never more than that of the test plus the larger of that of
S1 and S2
if (condition)
S1
else
S2
General rules
• If there are function calls, these must be
analyzed first
• If there are recursive functions, be careful about
their analyses. For some recursions, the
analysis is trivial
long factorial(int n){
if (n <= 1)
return 1;
return n * factorial(n-1);
}
General rules
• Let us analyze the following recursion
long fib(int n){
if (n <= 1)
return 1;
return fib(n-1) + fib(n-2);
}
• T ( N )  T ( N  1)  T ( N  2)  2
• By induction, it is possible to show that
fib( N )  (5 / 3) N and fib( N )  (3 / 2) N
• So the running time grows exponentially. By using a for
loop, the running time can be reduced substantially
Sequential search
index = -1;
for (i = 0; i < N; i++)
if (A[i] == value){
index = i;
break;
}
• Worst-case: It is the last element or it is not found
– N iterations  O(N)
• Best-case: It is the first element
– 1 iteration  O(1)
• Average case: We may have N different cases
– (N + 1) / 2 iterations  O(N) (successful searches)
– N iterations  O(N) (unsuccessful searches)
Searching a value in a sorted array
using binary search
int binarySearch(int *A, int N, int value){
int low = 0, high = N;
while (low <= high){
int mid = (low + high) / 2;
if (a[mid] < value)
low = mid + 1;
else if (a[mid] > value)
high = mid - 1;
else
return mid;
}
return -1;
}
Iterative solution of raising an
integer to a power
long pow1(long x, long n){
long result = 1;
for (int i = 1; i <= n; i++)
result = result * x;
return result;
}
Recursive solution of raising an
integer to a power
long pow2(long x, long n){
if (n == 0)
return 1;
if (n == 1)
return x;
if (n % 2 == 0)
return pow2(x*x,n/2);
else
return pow2(x*x,n/2) * x;
}
Download