Analysis Of Algorithms

advertisement
CS 502
Analysis Of Algorithms
Summer 2006
University of Bridgeport
Analysis of Algorithms
The field of computer science that studies efficiency of algorithms is known as analysis
of algorithms.
Example 1:
Bubblesort requires n2 statements to sort n elements for a given time interval
If computing speed is increased by 1,000,000, then 1,000,000n2 statements can be
executed in the same time interval.
But, 1,000,000n2 = (1000n)2
i.e. 1000n elements can be sorted which is an
increase (we lost 3 of 6 orders of
magnitude)
Example 2:
Swap down in heapsort requires lg(n) statements for n elements for a given time interval
If computing speed increased by 16, then 16lg(n) statements can be executed in the same
time interval.
But, 16lgn = lg(n16) [since lg m n  n lg m ]
i.e. n16 elements can be processed, which is an exponential increase
Algorithms for which problem size, which can be solved in the same time interval
increase linearly with computing speed are called linear complexity algorithms.
This information has been summarized in the table below:
Table 1: Effect of algorithm efficiency on ability to benefit from increases in speed
Algorithm
Complexity
Increase in
computing speed
Equivalent problem
size that can be
solved in the same
amount of time
Benefit from
increases in speed
Linear i.e. n
Quadratic i.e. n2
Logarithmic i.e. lg(n)
1,000,000 (106)
1,000,000 (106)
1,000,000 (106)
1,000,000n (106)
1000n (103)
Linear
106
n (to the power of
106)
Exponential
1
CS 502
Analysis Of Algorithms
Summer 2006
University of Bridgeport
Complexity of Algorithm Depends On Algorithm Implementation:
Example 1: Bubblesort
Code:
for i:= 1 to n-1
-----line 1
for j:= 1 to n-1
-----line 2
if X[j] > X[j+1]
-----line 3
temp := X[j]
-----line 4
X[j] := X[j+1]
-----line 5
X[j+1] := temp
-----line 6
T(n)
=
=
=
=
Statement #
# of executions
1
2
3
4, 5, 6
n
(n-1)n
(n-1)(n-1)
n( n 1)
2
3n 2  3n
n + (n – n) + (n – 2n + 1) +
2
2
3
n

3
n
2n2 – 2n + 1 +
2
2
4n  4n  2 3n 2  3n
+
2
2
7 2 7
7n 2  7n  2
n - n+1
=
2
2
2
2
Modify line 2 of code to:
2
j:=1 to n – i
{this eliminates unnecessary comparisons}
Statement #
# of executions
1
2
n
3, 4, 5, 6
n
n(n  1)
1  i
2
i 2
n(n  1) n 1
 i
2
i 1
n2  n
n
 1  2(n 2  n)
T(n) =
2
n2  n
(2n 2  n  2n  1) 
=
2
4n 2  2n  2 n 2  n
=
+
2
2
5 2 1
5n 2  n  2
n - n -1
=
=
2
2
2
Thus, we see as algorithm implementation will vary, T(n) will vary.
2
CS 502
Analysis Of Algorithms
Summer 2006
University of Bridgeport
Some Definitions:
Characteristic Operation: Statement performed repeatedly; used for finding time
complexity (T(n)).
Time Complexity:
Number of characteristic operation performed
A characteristic operation and its corresponding time complexity function are called
realistic if the execution time with respect to some implementation is bounded by a linear
function of T(n).
Example: Choosing characteristic operation of Bubblesort
Statement # chosen as characteristic operation
Time complexity
1
2
3
4, 5, 6
n
(n-1)n
(n-1)(n-1)
0 to (n-1)(n-1)
depending on condition
We’ve seen that for bubblesort, T (n) O(n 2 )
From the table above, time complexity is linearly bounded by T(n) if characteristic
operation is chosen to be either statement # 2 or statement # 3. But statement # 2 (for j:=
1 to n-1) is a looping construct that does not have any inherent connection to the meaning
of the algorithm. Statement # 3 (if X[j] > X[j+1]), on the other hand is the core of the
algorithm and hence a good choice for characteristic operation of the algorithm/ this
algorithm implementation
Another variation of bubblesort code:
Code:
pass := 1
swap := true
while swap and (pass < n-1)
swap := false
for j := 1 to (n – pass)
if X[j] > X[j + 1]
temp := X[j]
X[j] := X[j+1]
X[j+1] := temp
swap := true
-----line 1
-----line 2
-----line 3
-----line 4
-----line 5
-----line 6
-----line 7
-----line 8
-----line 9
-----line 10
Here, number of passes varies from 1 to (n – 1) depending on contents of array.
3
CS 502
Analysis Of Algorithms
Summer 2006
University of Bridgeport
Best, Worst and Average Time Complexities
Algorithm P accepts ‘k’ different instances of size ‘n’
Let
Ti(n) is time complexity of P given ith instance (1 ≤ i ≤ k)
Pi is probability that this instance occurs
Then,
Best-case Complexity,
B(n) = Min T (n)
1i  k
i
Worst-case Complexity,
W(n) = Max T (n)
1i  k
i
Average-case Complexity,
A(n) =
k
 pi Ti (n)
i 1
Example: Linear Search
Code:
LinearSearch (var R: RA type; a, b, x: integer): boolean
var i integer; found: boolean;
i := a;
found := false
while (i ≤ b ) and not found
found := R[i] = = x
i++
characteristic operation
LinearSearch := found
Algorithm has infinite instances, but all can be categorized into (n+1) classes:
x = R[1], x = R[2], x = R[3], … x = R[n], x  R
i
1
2
i
n
n+1
Instance
x = R[1]
x = R[2]
x = R[i]
x = R[n]
x  R[1… n]
pi
p/n
p/n
p/n
p/n
1–p
Ti(n)
1
2
i
n
n
Here,
B(n) = 1
W(n) = n (occurs in two cases)
4
CS 502
Analysis Of Algorithms
Summer 2006
University of Bridgeport
n 1
A(n)
=
 p T ( n)
i i
i 1
n
=
 p T ( n)  p
=
T ( n)
n 1 n 1
i i
i 1
n
p
 n i  (1  p)n
i 1
=
=
=
n
i
p  (1  p)n
i 1 n
p (n)(n  1)
 (1  p)n
n
2
( p)(n  1)
 (1  p)n
2
Variation: Linear Search with Ordered Array (early stop possible)
Here,
i
1
2
i
n
n+1
n+2
j
2n-1
2n
Instance
x = A[1]
x = A[2]
x = A[i]
x = A[n]
x < A[1]
x < A[2]
x < A[j]
x < A[2n-1]
x < A[n]
pi
p/n
p/n
p/n
p/n
(1 – p)/n
(1 – p)/n
(1 – p)/n
(1 – p)/n
(1 – p)/n
Ti(n)
1
2
i
n
1
2
j
n-1
n
success
failure
For linear search with ordered array,
2n
A(n)
=
 p T ( n)
i i
i 1
n
=
 piTi (n) 
i 1
=
=
=
2n
 piTi (n)
i  n 1
p (n)(n  1)
1  p 2n
[
](
)  (i  n)
n
2
n i i 1
p (n)(n  1) (1  p) n
[
]
i
n
2
n i 1
( p)(n  1) (1  p)(n  1)

2
2
2n
p
(1  p)
i

(i  n)


n
i 1 n
i  n 1
n
=
=
( n  1)
2
5
CS 502
Analysis Of Algorithms
Summer 2006
University of Bridgeport
Q. When is early stop better than no early stop in unordered array?
The above occurs when
or,
or,
or,
or,
or,
or,
or,
A UNORDERED (n)
( p)(n  1)
 (1  p)n
2
( p)(n  1)  2 n(1  p)
2
pn  p  2 n  2 pn
2
p  2n  pn
p  pn
p(1 n)
p
>
>
>
>
A EARLY STOP (n)
( n  1)
2
( n  1)
2
( n  1)
2
n 1
1 n
1 n
1
(we assume n > > 1  sign changes by division)
>
>
>
<
This shows that with our assumption that whenever Pr(element in array) < 1.0
EARLY STOP is better.
6
CS 502
Analysis Of Algorithms
Summer 2006
University of Bridgeport
Analysis of Recursive Algorithms
Let T(n) ≡ time complexity of (recursive) algorithm
Time complexity can be defined recursively.
Example 1: Factorial Function
Code: int factorial (int n) {
if (n = = 0) return 1;
else return n * factorial (n – 1);
}
Recurrence Equation:
T(n) = T (n – 1) + 1; n > 0 {all n > 0}
T(0) = 1
; n = 0 {base case}
Thus,
T(n – 1) = T(n – 2) + 1
T(n – 2) = T(n – 3) + 1
T(n – 3) = T(n – 4) + 1
.
.
.
.
.
.
There are a variety of techniques for solving the recurrence equation (to solve for T(n)
without T(n-1) on right side of the equation) to get a closed form. We’ll start with the
simplest technique: Repeated Substitution
Solving the recurrence equation of the factorial form to get the closed form,
T(n)
= T(n – 1) + 1
= [T(n – 2) + 1] + 1
= [[T(n – 3) + 1] + 1] + 1
= ……
= T(n – i) + i
{ith level of substitution}
if ‘i = n’,
T(n) = T(0) + n
or,
T(n) = n + 1
7
CS 502
Analysis Of Algorithms
Summer 2006
University of Bridgeport
Example 2: Towers of Hanoi
Code:
procedure Hanoi (n: integer; from, to, aux: char)
if n > 0
Hanoi (n – 1, from, aux, to)
write (‘from’, from, ‘to’, to)
Hanoi (n – 1, aux, to, from)
Here,
T(0) = 0
T(n) = T(n – 1) + 1 + T(n – 1);
= 2T(n – 1) + 1
n>0
Solving the recurrence equation to get the closed form using repeated substitution,
T(n)
=
2T(n – 1) + 1
T(n – 1)
=
2T(n – 2) + 1
T(n – 2)
=
2T(n – 3) + 1
T(n – 3)
=
2T(n – 4) + 1
T(n)
= 2T(n – 1) + 1
= 2[2T(n – 2) + 1] + 1
= 2[2[2T(n – 3) + 1] + 1] + 1
= 2[22T(n – 3) + 21 + 20] + 20
= 23T(n – 3) + 22 + 21 + 20
Repeating ‘i’ times,
T(n)
= 2iT(n – i) + 2i-1 + 2i-2 + … + 20
If ‘i = n’,
T(n)
= 2nT(0) + 2n-1 + 2n-2 + … + 20
= 2n-1 + 2n-2 + … + 20
n 1
=
2
i
i0
This is a geometric series whose sum is given by:
Here, a = 1, r = 2, n = n. Therefore, Sn 
i.e.
T(n) = 2n – 1
Sn 
1  2n
= 2n – 1.
1 2
a  ar n
1 r
8
CS 502
Analysis Of Algorithms
Summer 2006
University of Bridgeport
Example 3: Binary Search
Code:
function BinSearch (var R: Ratype; a,b: indextype; x: keytype): boolean
var mid: indextype
if a > b
BinSearch:= false
else
mid:= (a + b) div 2
characteristic operation
if x = R[mid]
BinSearch:= true
else
if x < R[mid]
BinSearch:=BinSearch(R,a,mid–1,x)
else
BinSearch:=BinSearch(R,mid+1,b,x)
Here,
T(0) = 0
T(n) = 1
1  T ( (n  1) / 2  1)
1  T (n  (n  1) / 2  1)
if x = R[mid]
if x < R[mid]
if x > R[mid]
Complexity of
BinSearch(R, 1,n, x)
We can eliminate the floor function by limiting ‘n’ to 2k – 1; k  Z 
2k – 1
2k-1 – 1
Now,
2k-1 – 1
T(0) = 0
T(n) = 1
1  T (2 k 1  1)
if x = R[mid]
if x ≠ R[mid]
Best case occurs if x = R[mid] is true the very first time.
Thus,
B(n) = 1
9
CS 502
Analysis Of Algorithms
Summer 2006
University of Bridgeport
Worst case occurs if x = R[mid] is always false.
Now,
W(0) = 0
W(2k – 1) =
1  W (2 k 1  1)
Solving the recurrence equation to get the closed form using repeated substitution,
W(2k – 1)
=
=
=
1  W (2 k 1  1)
1 + [1  W (2 k  2  1) ]
1 + [1 + [1  W (2 k 3  1) ]]
Repeating ‘i’ times,
W(2k – 1)
=
i  W (2 k i  1)
=
k+0 =k
if ‘i = k’,
W(2k – 1)
We want W(n), so
2k – 1 = n
or,
lg 2k = lg (n + 1)
or,
k
= lg (n + 1)
{k in terms of n}
Thus,
W(n) = lg (n + 1)
Average Case Complexity
First we need to set up a model.
As a first step we’ll determine the average case complexity given that the element is
present in the array. Here, we take into account the probability of not finding an element
at a particular level (in worst case we assumed had taken this part to be one i.e. the
element being not found at a particular level was taken as a certainty) and add to it the
work done at the next level recursively.
Thus,
AElementPresent(1) = 1
1
(1  k
)  A(2 k 1  1)
AElementPresent (2k – 1) =
2 1
1
1
(1  k
)  [(1  k 1
)  A(2 k  2  1)]
=
2 1
2 1
1
1
1
1 k
 1  k 1
 1  k 2
 A(2 k 3  1)
=
2 1
2 1
2 1
10
CS 502
Analysis Of Algorithms
Summer 2006
University of Bridgeport
Repeating ‘i’ times,
AElementPresent (2k – 1) = [i] 
Let ‘i = k-1’,
1
1
1
1
 k 1
 k 2
... k ( i 1)
 A(2 k i  1)
2 1 2 1 2 1
2
1
k
k 2
[k  1]  1  
AElementPresent (2k – 1) =
i 0
k 2
k
=
i 0
k 2
However,
2
i0
k 2
2
i 0
1
k 1
1
k i
1
k i
1


0
2
1
1
2
k i
1
is a strictly increasing series, therefore its upper bound is given by
k 1
1
k x
2
1
k i
1
dx 

0
k 1

2
0

1
k  x dx
2
xk
2 xk
dx 
ln 2
k 1
0
k
1
2

2 ln 2 ln 2
Substituting, k = lg (n + 1) we have
1
2 k
1
2  lg( n 1)



2 ln 2 ln 2 2 ln 2
ln 2
1
1
1
1




lg(n 1)
2 ln 2 ln 2  2
2 ln 2 ln 2(n  1)
Now combining both the terms we have,
AElementPresent (n)
=
lg(n  1) 
1
1

2 ln 2 ln 2(n  1)
[
1
 0.7213 ]
2 ln 2
Given that:
p ≡ Pr (element exist in the array)
A(n)
p(average cost when element is present) + (1 – p) (worst cost)
1
1

p[ lg(n  1) 
] + (1 – p) [lg(n+1)]
2 ln 2 ln 2(n  1)
=
=
For example, if n = 1023,
A(n)
=
p[lg(1024) – 0.7213 +
=
p(9.3) + (1 – p)(10)
1
+ (1 – p)[lg(1024)
0.6932(1024)
11
CS 502
Analysis Of Algorithms
Summer 2006
University of Bridgeport
Solving Homogenous Linear Recurrence Relations Using Method of Characteristic
Roots (Another Method To Solve Recurrence Equations)
Homogenous Linear Recurrence Relation:
The term linear means that each term of the sequence is defined as a linear function of
the preceding terms. The coefficients and the constants may depend on n, even nonlinearly. Homogeneous means that the constant term of the relation is zero.
Example 1:
Recurrence Relation: an  5an1  6an2
a1  7
a0  1
General form:
an  c1 a n 1  c2 an  2 ... c k an  k  0
a k 1 ...
.
.
a0 ...
Let an  r n n
Substituting this in original recurrence relation, we have
r n  5r n1  6r n2
or,
r n  5r n1  6r n2  0
Dividing by rn-2 gives us,
r 2  5r  6  0
or,
(r – 2) (r – 3) = 0
=>
r1 = 2 r2 = 3
=>
an = 2n or, an = 3n;
both work
Therefore, general solution is:
an   2 n   3n
Now, consider the initial conditions:
or,  21   31  7
or, 2  3  7
a1  7
a0  1
or,  2 0   3 0  1
or,     1
----- (1)
or,   1   ----- (2)
12
CS 502
Analysis Of Algorithms
Summer 2006
University of Bridgeport
Substituting value of α from (2) in (1), we get
2(1   )  3  7
2  2   3  7
2  7
5
Therefore,
  1   1 5
or,   4
Thus, specific solution is:
an  5  3n  4  2 n
Cross-checking some values,
a0  5  3 0  4  2 0  1
a1  5  3  4  2  7
1
a2  5  3  4  2  29
2
a n  5a n 1  6 a n  2
1
2
These are same as
.
.
a 0  1; and , a1  7;
a 2  5  a1  6  a 2
or, a 2  5  7  6  1  29
an  5  3 n  4  2 n ...
Example 2: Fibonacci Sequence
Recurrence Relation: an  an1  an2
a1  1
a0  0
Let an  r n n
Substituting this in original recurrence relation, we have
r n  r n 1  r n  2
or,
r n  r n1  r n2  0
Dividing by rn-2 gives us,
r2  r 1  0
1 5
1 5
r1 
r2 
=>
2
2
1 5 n
1 5 n
an  (
) or, a n  (
) ;
=>
2
2
both work
13
CS 502
Analysis Of Algorithms
Summer 2006
University of Bridgeport
an   (
Therefore, general solution is:
1 5 n
1 5 n
)  (
)
2
2
Now, consider the initial conditions:
1 5 1
1 5 1
)  (
) 1
or,  (
a1  1
2
2
a0  0
or,     0
or,   
----- (1)
----- (2)
Substituting value of α from (2) in (1), we get
1 5
1 5
 (
)  (
) 1
2
2
  (1  5 )   (1  5 )  2
   (  5 )    ( 5 )  2
2  5 )  2
 5 )  1
1
5
1
 
5

Thus, specific solution is:
an 
1 1 5 n
1 1 5 n
(
) 
(
)
2
2
5
5
Crosschecking values,
1 1 5 0
1 1 5 0
1
1
a0 
(
) 
(
) 

0
2
2
5
5
5
5
1 1 5 1 1 1 5 1
1 1 5 1 5
2 5
a1 
(
) 
(
) 
(
)
1
2
2
2
5
5
5
2 5
Example 3:
Recurrence Relation: an  an1  6an2
a1  8
a0  1
Let an  r n n
Substituting this in original recurrence relation, we have
r n  r n1  6r n2
14
CS 502
Analysis Of Algorithms
or,
Summer 2006
University of Bridgeport
r n  r n1  6r n2  0
Dividing by rn-2 gives us,
r2  r  6  0
=>
r1  2 r2  3
=>
an  (2) n or, an  3n ;
both work
Therefore, general solution is:
an   (2) n   3n
Now, consider the initial conditions:
or, 2  3  8
a1  8
a0  1
or,     1
----- (1)
or,   1  
----- (2)
Substituting value of α from (2) in (1), we get
2(1   )  3  8
2  5   8
5  10
2
   1  2  1
Thus, specific solution is:
an  1(2) n  2  3n
Crosschecking values,
a0  1(2) 0  2(3) 0  1  2  1
a1  1( 2)1  2(3)1  2  6  8
a2  1( 2) 2  2(3) 2  4  18  14
a3  1( 2) 3  2(3) 3  8  54  62
15
CS 502
Analysis Of Algorithms
Summer 2006
University of Bridgeport
Iterative Algorithms
Just as recursive algorithms lead naturally to recurrence equations, so iterative algorithms
lead naturally to formulas involving summations. The following examples use an iterative
approach to solve the given problem.
Example 1: Average Depth of BST Node
First we need to set up a model
Let n = 2k – 1 ≈ 2k
Given,
n = 16, k = 4
Approach 1
One way to look at it is that half the total number of nodes have depth 3, one-fourth of
total nodes have depth 2 and so on. Thus, total depth of all the nodes can be put in the
form:
1
1
1
1
 3   2  1   0
2
4
8
16
or,
k
k
1
k k i
(
k

i
)

 i


i
i
i 1 2
i 1 2
i 1 2
k
k
 k  2   i2  i
i
i 1
i 1
This gives us a model but we can find the formula. However, this approach uses
approximate values. The second approach we’ll see uses exact values and hence gives a
better model.
Approach 2
Another way to look at it is that 1 node has depth 0, 2 nodes have depth 1 and so on.
Here, the total depth of all the nodes is given by the form:
2 0  0  21  1  2 2  2  23  3
or,
k 1
k 1
i0
i 1
 i2 i   i2 i
But we know,
k
 i2
i
=
(k  1)2 k 1  2 (proof in handout: ‘Mathematical Tools’, Page 4)
i 1
16
CS 502
Analysis Of Algorithms
Summer 2006
University of Bridgeport
Substituting this value in the equation above, we have
k 1
 i2
i 1
i
 (k  2)2 k  2
 (log 2 (n  1)  2)(n  1)
[since n = 2k – 1]
Thus,
Average Depth of full BST Node =
(log 2 (n  1)  2)(n  1)
n
The following table shows some actual values of average depths of BST nodes for given
values of k:
k
4
5
6
7
8
9
10
11
12
13
4
15
16
n
15
31
63
127
255
511
1023
2047
4095
8191
16383
32767
65535
average depth
2.27
3.16
4.10
5.06
6.03
7.02
8.01
9.01
10.00
11.00
12.00
13.00
14.00
Example 2: Binary Search
Using iterative approach, the following model can be set up for the average number of
comparisons for binary search work.
Assume the total number of elements in an array, n = 2k – 1
Then,
Pr (comparisons = 1) =
Pr (comparisons = 2) =
.
.
Pr (comparisons = k) =
1
20

2k  1 2k  1
2
21

2k  1 2k  1
2 k 1
2k  1
17
CS 502
Analysis Of Algorithms
Summer 2006
University of Bridgeport
Therefore, assuming element is in array
Average number of comparisons

 # ofComparisons * Pr(# comparisons)
comp
k
2 i 1
 i  k
2 1
i 1
k

i  2
i 1
i 1
2k  1
Solving the numerator, we have
k
i  2
i 1
i 1
k
  i[2 i  2 i 1 ]
i 1
k
k
  i2   i2
i
i 1
k 1
k
  i 2   (i  1)2 i
i
i 1
i 1
k
k 1
k 1
i 1
i0
i0
i0
  i2 i   i2 i   2 i
 k2  2  1
This is a g.p. where,
A -> 1
R -> 2
N -> k
 ( k  1)2 k  1
Therefore, Sn =
i 1
k
k
Putting the entire expression together, we have
Average number of comparisons(assuming element is in array)




or,
Average number of comparisons(assuming element is in array)
12
k
12
(k  1)2 k  1
2k  1
(lg(n  1)  1)(n  1)  1
(n  1)  1
(n  1) lg(n  1)  n  1  1
n
(n  1) lg(n  1)  n
n
(n  1) lg(n  1)
1
n
As n   , the complexity approaches lg(n  1)  1 [This is the large N solution]
For example, if n = 15 (24 – 1)i.e. k = 4
2i 1
1  4  12  32 49


 3.266
k
15
15
2 1
i 1
(4  1)2 4  1
49
1 
 3.266 or,
Application of analytical solution gives us:
4
2 1
15
(15  1) lg(15  1)2 4
(16)(4)
49
1 
1 
 3.266
15
15
15
4
Numeric solution is given by:
i 
18
Download