Slides 2

Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Chapter 2:
Algorithm Analysis
Big-Oh and Other Notations
in Algorithm Analysis
• Classifying Functions by Their Asymptotic Growth
• Theta, Little oh, Little omega
• Big Oh, Big Omega
• Rules to manipulate Big-Oh expressions
• Typical Growth Rates
Classifying Functions by Their
Asymptotic Growth
Asymptotic growth: the rate of growth of a function.
Given a particular differentiable function f(n), all other differentiable functions fall into three classes:
• growing at the same rate
• growing faster
• growing slower
Theta
f(n) and g(n) have the same rate of growth if
lim( f(n) / g(n) ) = c, 0 < c < ∞, as n → ∞
Notation: f(n) = Θ( g(n) )
pronounced "theta"
Little oh
f(n) grows slower than g(n)
(or g(n) grows faster than f(n))
if
lim( f(n) / g(n) ) = 0, n → ∞
Notation: f(n) = o( g(n) )
pronounced "little oh"
Little omega
f(n) grows faster than g(n)
(or g(n) grows slower than f(n))
if
lim( f(n) / g(n) ) = ∞, as n → ∞
Notation: f(n) = ω( g(n) )
pronounced "little omega"
Little omega and Little oh
If g(n) = o( f(n) ), then f(n) = ω( g(n) ).
Examples: Compare n and n²
lim( n / n² ) = 0 as n → ∞, so n = o(n²)
lim( n² / n ) = ∞ as n → ∞, so n² = ω(n)
Theta: Relation of Equivalence
R: "having the same rate of growth":
relation of equivalence,
gives a partition over the set of all
differentiable functions - classes of
equivalence.
Functions in one and the same class are
equivalent with respect to their growth.
Algorithms with Same Complexity
Two algorithms have the same complexity if the functions representing the number of operations have the same rate of growth.
Among all functions with the same rate of growth we choose the simplest one to represent the complexity.
Examples
Compare n and (n+1)/2:
lim( n / ((n+1)/2) ) = 2, so they have the same rate of growth.
(n+1)/2 = Θ(n): the rate of growth of a linear function
Examples
Compare n² and n² + 6n:
lim( n² / (n² + 6n) ) = 1, so they have the same rate of growth.
n² + 6n = Θ(n²): the rate of growth of a quadratic function
Examples
Compare log n and log n²:
lim( log n / log n² ) = 1/2 (since log n² = 2 log n), so they have the same rate of growth.
log n² = Θ(log n): a logarithmic rate of growth
Examples
Θ(n³): n³, 5n³ + 4n, 105n³ + 4n² + 6n
Θ(n²): n², 5n² + 4n + 6, n² + 5
Θ(log n): log n, log n², log( n + n³ )
Comparing Functions
• same rate of growth: g(n) = Θ(f(n))
• different rates of growth:
  either g(n) = o(f(n)): g(n) grows slower than f(n), and hence f(n) = ω(g(n));
  or g(n) = ω(f(n)): g(n) grows faster than f(n), and hence f(n) = o(g(n))
The Big-Oh Notation
f(n) = O(g(n)) if f(n) grows at the same rate as or slower than g(n), i.e.
f(n) = Θ(g(n)) or f(n) = o(g(n))
Example
n + 5 = Θ(n) = O(n) = O(n²) = O(n³) = O(n⁵)
The tightest estimate is n + 5 = Θ(n),
but the general practice is to use the Big-Oh notation: n + 5 = O(n)
The Big-Omega Notation
The inverse of Big-Oh is Ω.
If g(n) = O(f(n)), then f(n) = Ω(g(n)).
f(n) grows faster than or at the same rate as g(n): f(n) = Ω(g(n))
Rules to manipulate
Big-Oh expressions
Rule 1:
a. If T1(N) = O(f(N)) and T2(N) = O(g(N)),
then T1(N) + T2(N) = max( O(f(N)), O(g(N)) )
Rules to manipulate
Big-Oh expressions
b. If T1(N) = O(f(N)) and T2(N) = O(g(N)),
then T1(N) * T2(N) = O( f(N) * g(N) )
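For instance (a worked instance of Rules 1a and 1b, not on the original slide):
if T1(N) = O(N²) and T2(N) = O(N), then
T1(N) + T2(N) = O(N²) and T1(N) * T2(N) = O(N³).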
Rules to manipulate
Big-Oh expressions
Rule 2:
If T(N) is a polynomial of degree k, then T(N) = Θ( N^k ).
Rule 3:
log^k N = O(N) for any constant k.
Examples
n² + n = O(n²): we disregard any lower-order terms
n log(n) = O(n log(n))
n² + n log(n) = O(n²)
Typical Growth Rates
C          constant, we write O(1)
log N      logarithmic
log²N      log-squared
N          linear
N log N
N²         quadratic
N³         cubic
2^N        exponential
N!         factorial
Exercise
True or False
N²  = O(N²)
2N  = O(N²)
N   = O(N²)
N²  = O(N)
2N  = O(N)
N   = O(N)
Exercise
True or False
N²  = Θ(N²)
2N  = Θ(N²)
N   = Θ(N²)
N²  = Θ(N)
2N  = Θ(N)
N   = Θ(N)
Running Time Calculations
The work done by an algorithm, i.e. its complexity, is determined by the number of basic operations necessary to solve the problem.
The Task
Determine how the number of operations depends on the size of the input:
N: size of input
F(N): number of operations
Basic operations in an algorithm
Problem: Find x in an array.
Operation: comparison of x with an entry in the array.
Size of input: the number of elements in the array.
Basic operations ….
Problem: Multiplying two matrices
with real entries
Operation:
Multiplication of two real numbers
Size of input:
The dimensions of the matrices
Basic operations ….
Problem: Sort an array of numbers
Operation: Comparison of two
array entries plus moving elements in
the array
Size of input: The number of
elements in the array
Counting the number of operations
A. for loops O(n)
The running time of a for loop is at
most the running time of the
statements inside the loop times
the number of iterations.
for loops
sum = 0;
for( i = 0; i < n; i++ )
    sum = sum + i;
The running time is O(n)
Counting the number of operations
B. Nested loops
The total running time is the
running time of the inside statements
times
the product of the sizes of all the loops
Nested loops
sum = 0;
for( i = 0; i < n; i++ )
    for( j = 0; j < n; j++ )
        sum++;
The running time is O(n²)
Counting the number of operations
C. Consecutive program fragments
Total running time :
the maximum of the running time
of the individual fragments
Consecutive program fragments
sum = 0;                          // O(n)
for( i = 0; i < n; i++ )
    sum = sum + i;

sum = 0;                          // O(n²)
for( i = 0; i < n; i++ )
    for( j = 0; j < 2*n; j++ )
        sum++;

The maximum is O(n²)
Counting the number of operations
D: If statement
if ( C )
    S1;
else
    S2;
The running time is the time to evaluate the condition C plus the maximum of the running times of S1 and S2.
EXAMPLES
What is the number of operations?
sum = 0;
for( i = 0; i < n; i++ )
    for( j = 0; j < n*n; j++ )
        sum++;
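One way to count (the answer is left open on the slide): the inner loop executes n² times for each of the n outer iterations, so sum++ runs n·n² = n³ times: O(n³).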
EXAMPLES
What is the number of operations?
sum = 0;
for( i = 0; i < n; i++ )
    for( j = 0; j < i; j++ )
        sum++;
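One way to count (the answer is left open on the slide): the inner loop runs i times on the i-th outer iteration, and 0 + 1 + … + (n-1) = n(n-1)/2, so the running time is O(n²).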
EXAMPLES
What is the number of operations?
for( j = 0; j < n*n; j++ )
    compute_val( j );
The complexity of compute_val(x) is given to be O(n*logn).
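One way to count (the answer is left open on the slide): there are n² calls, each costing O(n log n), so the total is O(n³ log n).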
Search in an unordered array
of elements
for (i = 0; i < n; i++)
    if (a[ i ] == x) return 1;
return -1;
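(Worst case: x is compared with all n elements, so O(n) operations.)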
Search in a table n x m
for (i = 0; i < n; i++)
    for (j = 0; j < m; j++)
        if (a[ i ][ j ] == x) return 1;
return -1;
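(Worst case: n*m comparisons, so O(n*m) operations.)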
Max Subsequence Problem
• Given a sequence of integers A1, A2, …, An, find the maximum possible value of a subsequence Ai, …, Aj.
• Numbers can be negative.
• You want a contiguous chunk with the largest sum.
• Example: -2, 11, -4, 13, -5, -2
• The answer is 20 (subseq. A2 through A4).
• We will discuss 4 different algorithms, with time complexities O(n³), O(n²), O(n log n), and O(n).
• With n = 10^6, algorithm 1 may take > 10 years; algorithm 4 will take a fraction of a second!
Algorithm 1 for Max Subsequence Sum
• Given A1, …, An, find the maximum value of Ai + Ai+1 + ··· + Aj
  (the answer is 0 if the max value is negative).

int maxSum = 0;                           // O(1)
for( int i = 0; i < a.size( ); i++ )
    for( int j = i; j < a.size( ); j++ )
    {
        int thisSum = 0;                  // O(1)
        for( int k = i; k <= j; k++ )
            thisSum += a[ k ];            // O(1)
        if( thisSum > maxSum )            // O(1)
            maxSum = thisSum;
    }
return maxSum;

Time complexity: O(n³). The innermost loop costs O(j - i + 1), and
Σ (i = 0..n-1) Σ (j = i..n-1) (j - i + 1) = O(n³).
Algorithm 2
• Idea: Given the sum from i to j-1, we can compute the sum from i to j in constant time.
• This eliminates one nested loop, and reduces the running time to O(n²).

int maxSum = 0;
for( int i = 0; i < a.size( ); i++ )
{
    int thisSum = 0;
    for( int j = i; j < a.size( ); j++ )
    {
        thisSum += a[ j ];
        if( thisSum > maxSum )
            maxSum = thisSum;
    }
}
return maxSum;
Algorithm 3
• This algorithm uses the divide-and-conquer paradigm.
• Suppose we split the input sequence at the midpoint.
• The max subsequence is entirely in the left half, entirely in the right half, or it straddles the midpoint.
• Example:
      left half    |   right half
    4  -3  5  -2   |   -1  2  6  -2
• Max in left is 6 (A1 through A3); max in right is 8 (A6 through A7). But the straddling max is 11 (A1 through A7).
Algorithm 3 (cont.)
• Example:
      left half    |   right half
    4  -3  5  -2   |   -1  2  6  -2
• Max subsequences in each half are found by recursion.
• How do we find the straddling max subsequence?
• Key Observation:
  – The left half of the straddling sequence is the max subsequence ending with -2.
  – The right half is the max subsequence beginning with -1.
• A linear scan lets us compute these in O(n) time.
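A minimal Java sketch of Algorithm 3 (the slides give no code for it; the method name maxSubSum and the 0-for-all-negative convention follow the earlier algorithms):

// Divide-and-conquer max subsequence sum on a[left..right].
static int maxSubSum( int[] a, int left, int right )
{
    if( left == right )                       // base case: one element
        return Math.max( a[ left ], 0 );      // empty subsequence counts as 0

    int mid = ( left + right ) / 2;
    int maxLeft  = maxSubSum( a, left, mid );        // entirely in left half
    int maxRight = maxSubSum( a, mid + 1, right );   // entirely in right half

    // Best sum ending exactly at mid: the left part of a straddling sequence.
    int leftBorder = 0, sum = 0;
    for( int i = mid; i >= left; i-- )
    {
        sum += a[ i ];
        leftBorder = Math.max( leftBorder, sum );
    }

    // Best sum starting exactly at mid + 1: the right part of a straddling sequence.
    int rightBorder = 0;
    sum = 0;
    for( int i = mid + 1; i <= right; i++ )
    {
        sum += a[ i ];
        rightBorder = Math.max( rightBorder, sum );
    }

    return Math.max( Math.max( maxLeft, maxRight ), leftBorder + rightBorder );
}

Called as maxSubSum( a, 0, a.length - 1 ).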
Algorithm 3: Analysis
• The divide and conquer is best analyzed through a recurrence:
  T(1) = 1
  T(n) = 2T(n/2) + O(n)
• This recurrence solves to T(n) = O(n log n).
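One quick way to see this (a standard unrolling, not shown on the slide), writing the O(n) term as cn:
T(n) = 2T(n/2) + cn
     = 4T(n/4) + 2cn
     = ...
     = 2^k T(n/2^k) + k*cn
With k = log₂n this gives T(n) = n*T(1) + cn*log₂n = O(n log n).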
Algorithm 4
2, 3, -2, 1, -5, 4, 1, -3, 4, -1, 2

int maxSum = 0, thisSum = 0;
for( int j = 0; j < a.size( ); j++ )
{
    thisSum += a[ j ];
    if( thisSum > maxSum )
        maxSum = thisSum;
    else if( thisSum < 0 )
        thisSum = 0;
}
return maxSum;

• Time complexity: clearly O(n).
• But why does it work? We need a proof of correctness.
Proof of Correctness
• The max subsequence cannot start or end at a negative Ai.
• More generally, the max subsequence cannot have a prefix with a negative sum.
  Ex: -2  11  -4  13  -5  -2
• Thus, if we ever find that Ai through Aj sums to < 0, then we can advance i to j+1.
  – Proof: suppose j is the first index after i at which the sum becomes < 0.
  – The max subsequence cannot start at any p between i and j: the sum of Ai through Ap-1 is nonnegative (j is the first negative prefix), so starting at i would be at least as good.
Algorithm 4
int maxSum = 0, thisSum = 0;
for( int j = 0; j < a.size( ); j++ )
{
    thisSum += a[ j ];
    if( thisSum > maxSum )
        maxSum = thisSum;
    else if( thisSum < 0 )
        thisSum = 0;
}
return maxSum;

• The algorithm resets thisSum whenever the running sum drops below 0. Otherwise it extends the current sum and updates maxSum, all in one pass.
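A hand-worked trace on the sample input 2, 3, -2, 1, -5, 4, 1, -3, 4, -1, 2 (computed here; not on the original slide):

j      :  0  1   2  3   4     5  6   7  8   9  10
a[j]   :  2  3  -2  1  -5     4  1  -3  4  -1   2
thisSum:  2  5   3  4  -1→0   4  5   2  6   5   7
maxSum :  2  5   5  5   5     5  5   5  6   6   7

The answer is 7, from the subsequence 4, 1, -3, 4, -1, 2.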
Why Efficient Algorithms Matter
• Suppose N = 10^6.
• A PC can read/process N records in 1 sec.
• But if some algorithm does N*N computation, then it takes 1M seconds = 11 days!!!
• 100-city Traveling Salesman Problem:
  – A supercomputer checking 100 billion tours/sec still requires 10^100 years!
• Fast factoring algorithms can break encryption schemes. Algorithms research determines what a safe code length is (> 100 digits).
How to Measure Algorithm Performance
• What metric should be used to judge algorithms?
  – Length of the program (lines of code)
  – Ease of programming (bugs, maintenance)
  – Memory required
  – Running time
• Running time is the dominant standard.
  – Quantifiable and easy to compare
  – Often the critical bottleneck
Logarithms in Running Time
• Binary search
• Euclid's algorithm
• Exponentials
• Rules to count operations
Divide-and-conquer algorithms
Successively reducing the problem by a factor of two requires O(logN) operations.
Why logN?
A complete binary tree with N leaves has logN levels.
Each level in a divide-and-conquer algorithm corresponds to an operation.
Hence the number of operations is O(logN).
Example: 8 leaves, 3 levels
Binary Search
Solution 1:
Scan all elements from left to right,
each time comparing with X.
O(N) operations.
Binary Search
Solution 2: O(logN)
Find the middle element Amid in the list and compare it with X.
If they are equal, stop.
If X < Amid, consider the left part.
If X > Amid, consider the right part.
Repeat until the list is reduced to one element.
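A minimal Java sketch of Solution 2 (the slides give no code; assumes a is sorted in ascending order):

// Iterative binary search: returns the index of x in a, or -1 if absent.
static int binarySearch( int[] a, int x )
{
    int low = 0, high = a.length - 1;
    while( low <= high )
    {
        int mid = ( low + high ) / 2;
        if( a[ mid ] == x )
            return mid;          // found: stop
        else if( x < a[ mid ] )
            high = mid - 1;      // consider the left part
        else
            low = mid + 1;       // consider the right part
    }
    return -1;                   // list exhausted: not found
}

Each iteration halves the remaining range, so there are O(logN) iterations.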
Euclid's algorithm
Finding the greatest common divisor (GCD):
the GCD of M and N, where M > N, equals the GCD of N and M % N.
GCD and recursion
Recursion:
If M % N == 0, return N
Else return GCD( N, M % N )
The answer is the last nonzero remainder.
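A direct Java rendering of this recursion (a sketch; the slides give only the iterative version below). Assumes m, n > 0:

// Recursive GCD, following the definition above.
static long gcd( long m, long n )
{
    if( m % n == 0 )
        return n;                // the last nonzero remainder
    return gcd( n, m % n );
}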
Example: GCD( 24, 15 )

 M    N    rem
 24   15    9
 15    9    6
  9    6    3
  6    3    0
  3    0

The answer is 3.
Euclid's Algorithm (non-recursive implementation)

long gcd( long m, long n )
{
    long rem;
    while( n != 0 )
    {
        rem = m % n;
        m = n;
        n = rem;
    }
    return m;
}
Why O(logN)
M % N <= M / 2
(if N <= M/2, then M % N < N <= M/2; if N > M/2, then M % N = M - N < M/2)
After the 1st iteration N appears as the first argument.
After the 2nd iteration the remainder appears as the first argument, and it is at most half of the original first argument.
So every two iterations the first argument is at least halved, hence O(logN).
Computing X^N

X^N = ( X² )^(N/2)        if N is even
X^N = X * ( X² )^(N/2)    if N is odd (integer division N/2)
long pow( long x, int n )
{
    if( n == 0 )
        return 1;
    if( n % 2 == 0 )                      // is_Even( n ) on the slide
        return pow( x * x, n/2 );
    else
        return x * pow( x * x, n/2 );
}
Why O(logN)?
If N is odd there are two multiplications per step.
So the number of multiplications is at most 2logN: O(logN).
Another recursion for X^N
Another recursive definition reduces the power by just 1:
X^N = X * X^(N-1)
Here the number of operations is N-1, i.e. O(N), and the algorithm is less efficient than the divide-and-conquer algorithm.
How to count operations
• single statements (not function calls): constant, O(1)
• sequential fragments: the maximum of the operations of each fragment
How to count operations
• single loop running up to N, with single
statements in its body: O(N)
• single loop running up to N,
with the number of operations in the body
O(f(N)):
O( N * f(N) )
How to count operations
• two nested loops each running up to N, with
single statements: O(N2)
• divide-and-conquer algorithms with input
size N: O(logN)
Or O(N*logN) if each step requires additional
processing of N elements
Example: What is the probability that two numbers are relatively prime?

tot = 0; rel = 0;
for( i = 1; i <= n; i++ )        // start at 1: pairing with 0 is not meaningful
    for( j = i+1; j <= n; j++ )
    {
        tot++;
        if( gcd( i, j ) == 1 ) rel++;
    }
return (double) rel / tot;       // cast avoids integer division

Running time = ?
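One way to count (the answer is left open on the slide): the double loop makes about n²/2 gcd calls, and each call costs O(log n), so the running time is O(n² log n).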