Analysis of Algor ithms

advertisement
TDDB57 DALG – Lecture 2: Analysis of algorithms.
Analysis of Algorithms
Page 1
J. Maluszynski, IDA, Linköpings Universitet, 2004.
[Lewis/Denenberg 2.1]
[Lewis/Denenberg 2.2]
[Lewis/Denenberg 1.3 (except of pp. 26-32)]
TDDB57 DALG – Lecture 2: Analysis of algorithms.
Correctness
Page 2
“An algorithm must not give a wrong answer.”
J. Maluszynski, IDA, Linköpings Universitet, 2004.
[Lewis/Denenberg]
A function fact for computing factorial must not return 6 for the call fact 2 .
Which answers are wrong?
the user knows that, or
An algorithm is correct iff for any legal input
– memory
– time (focus of this course)
Analysis of time efficiency should be:
– machine-independent
– valid for all legal data
We compare:
– time growth-rate for growing size of (input) data (scalability)
– mostly for worst-case data
J. Maluszynski, IDA, Linköpings Universitet, 2004.
a specification of legal inputs and corresponding correct answers is needed.
What to analyze
correctness
termination
efficiency
Time efficiency
Mathematical background: comparing functions
growth rate
Page 4
the computation terminates, and
Page 3
analysis techniques for recursive algorithms
TDDB57 DALG – Lecture 2: Analysis of algorithms.
Resources used by an algorithm:
Efficiency
n
1 n
Termination
1
2n
An algorithm should terminate:
TDDB57 DALG – Lecture 2: Analysis of algorithms.
worst case, expected case, amortized
J. Maluszynski, IDA, Linköpings Universitet, 2004.
the answer is as specified.
2
analysis techniques for iterative algorithms
1
Different algorithms may solve the same problem.
How to compare them?
TDDB28
1
produce an answer in a finite number of steps
for any legal input
0.
Termination is a difficult problem
Example:
Algorithm for squaring an integer using n2
function Square integer n : integer
if n 0 return 0
if n 0 return Square n 1 2 n
does not terminate for n
Page 5
J. Maluszynski, IDA, Linköpings Universitet, 2004.
J. Maluszynski, IDA, Linköpings Universitet, 2004.
1 , key K : integer
TDDB57 DALG – Lecture 2: Analysis of algorithms.
Analysis of running time: example
function TableSearch( table key T 0 n
(1) for i from 0 to n 1 do
(2)
if T i K then return i
(3)
if T i K then return 1
(4) return 1
What is the worst-case problem instance?
Worst case time:
n t1 t2 t3 t4
n characterizes the size of input data.
Page 7
TDDB57 DALG – Lecture 2: Analysis of algorithms.
Page 6
Principles of Algorithm Analysis
J. Maluszynski, IDA, Linköpings Universitet, 2004.
An algorithm should work for (input) data of any size.
(Example TableSearch: input size is the size of the table.)
Estimate the used resource (time/memory) by
a function of the size of input.
Estimation : comparison(?) with well-known functions
Focus on the worst case performance (also average performance).
Ignore constant factors
analysis should be machine-independent;
more powerful computers introduce speed-up by constant factors.
Page 8
n log2 n
4
256
4096
n2
4
6 5 104
1 84 1019
2n
J. Maluszynski, IDA, Linköpings Universitet, 2004.
Study scalability / asymptotic behaviour for large problem sizes:
ignore lower-order terms, focus on dominating terms.
TDDB57 DALG – Lecture 2: Analysis of algorithms.
n
2
64
384
TDDB57 DALG – Lecture 2: Analysis of algorithms.
Growth of well-known functions
log2n
2
16
64
Comparing Efficiency of Algorithms
n
1
4
6
Functions describing execution time:
time: Nonnegative real numbers
size: Natural numbers
usually growing/non-decreasing
2
16
64
1 84 1019µsec = 2 14 108 days = 5845 centuries
How quickly grows the execution time of your algorithm?
More efficient algorithm: the function grows slower
but:
constant factors should be ignored: we can install faster computer
We need a formal basis for the comparison.
TDDB57 DALG – Lecture 2: Analysis of algorithms.
Page 9
J. Maluszynski, IDA, Linköpings Universitet, 2004.
TDDB57 DALG – Lecture 2: Analysis of algorithms.
Page 10
Order Notation: Formal Definition
Motivation for Order Notation
J. Maluszynski, IDA, Linköpings Universitet, 2004.
f is (in) O g iff there exist c
f n
cg n
f , g functions from natural numbers to positive real numbers
+ comparing growth rates of functions
classes of functions
Intuition: Apart from constant factors, f grows at most as quickly as g
0, n0 0 such that
for all n n0
Motivation:
+ estimating efficiency of algorithms by reference to simple functions
0, n0 0 such that
for all n n0
+ abstracting from constant factors
f is (in) Ω g iff there exist c
f n
cg n
f is (in) Θ g iff f n
Ogn
and g n
O f n
Page 12
0
f(n)
c g(n)
g(n)
n
n2 dominates g n
7n.)
J. Maluszynski, IDA, Linköpings Universitet, 2004.
Intuition: Apart from constant factors, f grows exactly as quickly as g
Intuition: Apart from constant factors, f grows at least as quickly as g
Ω is the converse of O, i.e. f is in Ω g iff g is in O f
TDDB57 DALG – Lecture 2: Analysis of algorithms.
J. Maluszynski, IDA, Linköpings Universitet, 2004.
Dominance relation
Page 11
f(n)
o f ) iff
n
f(n)
g(n)
TDDB57 DALG – Lecture 2: Analysis of algorithms.
4 g(n)
g(n)
n
3 g(n)
n0
3 g(n)
n0
2 g(n)
n0
f dominates (grows more quickly than) g (g
f n gn
∞ for n ∞
3 g(n)
2 g(n)
that is, for any constant c 0, there is some n0
(Ex.: f n
such that f n c g n for all n n0.
O f but not vice versa.
o f then g
If g
4 g(n)
f(n)
g(n)
n
2 g(n)
n
n0
Relate f and g using O and Ω
n0
n0
Page 13
Θ g ,:
J. Maluszynski, IDA, Linköpings Universitet, 2004.
TDDB57 DALG – Lecture 2: Analysis of algorithms.
1
3
2
O n2
O n3
Prove or disprove
n
1
O 2n
Ωg, f
1
O n2
n
3n
n5
TDDB57 DALG – Lecture 2: Analysis of algorithms.
Comparing functions
more...
Og, f
l
f n
lim
n
n ∞g
To check f
use definitions of O Ω Θ.
may be a bit complicated, or
∞
If
∞
0
l
O g iff l
Ω g iff l
Θ g iff 0
exists then
–f
–f
–f
Page 14
J. Maluszynski, IDA, Linköpings Universitet, 2004.
J. Maluszynski, IDA, Linköpings Universitet, 2004.
Page 16
TDDB57 DALG – Lecture 2: Analysis of algorithms.
J. Maluszynski, IDA, Linköpings Universitet, 2004.
Page 15
TDDB57 DALG – Lecture 2: Analysis of algorithms.
Oh
Some properties of O
Comparing Growth Rates of Well-known Functions
O h then f
Some simple facts:
If f O g and g
transitivity
If f O g then also f g O g
growth rate depends only on fastest growing components
O f g
O g then f g
and g
O f
If f
nα O nβ iff α β α β 0
nα o nβ iff α β
growth rate of power function is determined by the value of power
If there are d 0, n0 0 such that f n
d for all n n0
then k f c O f for any constants k c 0.
growth rate does not depend on constant coefficients
d
logb n o nα for any b α 0
power functions grow more quickly than logarithms
o d n iff c
nα o cn for any α 0, c 1
exponential functions grow more quickly than power functions
cn
loga n O logb n for any a and b
growth rate of logarithms of various bases is equal
d,
O d n iff c
cn
Any constant function f n
c is in O 1
there is no difference in growth rate of constant functions
y1:n
!
#
$
t3
t4
t5
t6
n
y1:n
J. Maluszynski, IDA, Linköpings Universitet, 2004.
1
0
i
J. Maluszynski, IDA, Linköpings Universitet, 2004.
: table real
, with n
n
∑ ai j x j
j 1
nn
yi
Page 18
maxit t2
n
,
matrix A
with
n
t4
t1
t0
Worst case time:
where maxit = maximal number of iterations of the while loop
"
n2 t3
TDDB57 DALG – Lecture 2: Analysis of algorithms.
t2
Example: Independent Nested Loops
Matrix-vector product (here, for a quadratic matrix)
Time: n t1
that is,
Page 20
A x
given:
vector x
compute: vector y
TDDB57 DALG – Lecture 2: Analysis of algorithms.
y
procedure matvec table real x 1 n , A 1 n 1 n
(1) for i from 1 to n do
(2)
yi
00
(3)
for j from 1 to n do
(4)
yi
yi Ai j x j
return y
J. Maluszynski, IDA, Linköpings Universitet, 2004.
n
J. Maluszynski, IDA, Linköpings Universitet, 2004.
t4
j 1
n t3
1
Page 17
1
i
TDDB57 DALG – Lecture 2: Analysis of algorithms.
i
Estimating execution time for iterative programs
n
Remark: There exists a better, linear-time algorithm!
∑ xj
Elementary operation
takes / can be bound by a constant time
yi
Sequence of operations
takes the sum of the times of its components
Loop (for... and while...)
the time of the body multiplied by number of repetitions (in the worst case)
with
x 1 n ) : table integer
t4
Conditional statement (if...then...else...)
the time for evaluating and checking the condition
plus maximum of the times for then and else parts.
Page 19
TDDB57 DALG – Lecture 2: Analysis of algorithms.
1 2
nn 1
t3
2
Example: Complexity Analysis of Binary Search
Example: Dependent Nested Loops
t2
n
,
given: Vector x
compute: “Prefix-sums” vector y
t2
A straightforward algorithm follows directly from the definition:
n t1
n t1
Prefix-Sums
n
function BinSearch table key T 0 n 1 key K : integer
(0) if n 0 then return 1
(1) l 0; u n 1
(2) while l u do
(3)
mid
l u 2
(4)
if K T mid then return mid
(5)
if K T mid then u mid 1 else l mid 1
(6) if K T l then return l else return 1
procedure pre f ixsum table integer
(1) for i from 1 to n do
(2)
yi
00
(3)
for j from 1 to i do
(4)
yi
yi x j
return y
Total time: t n
TDDB57 DALG – Lecture 2: Analysis of algorithms.
Page 22
t4
k1 n
c2
Page 24
k2
J. Maluszynski, IDA, Linköpings Universitet, 2004.
J. Maluszynski, IDA, Linköpings Universitet, 2004.
Example: comparing time complexity of search algorithms
t3
t2
n
(Why?)
n
(Why?)
log n
(Why?)
n
(Why?)
O
Ω
O
Θ
tTableSearch n
n t1
hence:
tTableSearch n
tTableSearch n
tTableSearch n
tTableSearch n
$
log2 n
O log n
Θ log n
On
J. Maluszynski, IDA, Linköpings Universitet, 2004.
n/2
n/2
#
tBinSearch n
c1
hence:
tBinSearch n
tBinSearch n
tBinSearch n
TDDB57 DALG – Lecture 2: Analysis of algorithms.
Analysis of Recursive Programs (2)
Characterize execution time by a recurrence relation
Find solution (closed form, non-recursive)
of the recurrence relation
If not listed in a textbook, you may:
$
Page 21
?
TDDB57 DALG – Lecture 2: Analysis of algorithms.
1. Unroll the recurrence relation a few times
to get a hypothesis for a possible solution: T n
2. Prove the hypothesis for T n by induction.
If that fails, modify the hypothesis and try again ...
12
&'
Example continued
How to compute maxit for n
n
0,
1,
1,
2,
2,
2
1
2
3
4
5
6
3
J. Maluszynski, IDA, Linköpings Universitet, 2004.
T n
maxit
maxit
maxit
maxit
maxit
maxit
maxit n 2
1
maxit n
log2 n
maxit n
Page 23
#
TDDB57 DALG – Lecture 2: Analysis of algorithms.
Analysis of Recursive Programs (1)
$
function f act integer n : integer
if n 0 then return 1
else return n f act n 1
Execution time:
time for comparison: tc
time for multiplication: tm
time for call and return neglected
(T is defined by a recurrence relation)
0
Total execution time T n
1 , if n
T n 2
tc tm
tm tc
T n
tm
T n
tc
tc
0:
%
T 0
T n
Hence for n
tm
tm
tc
(
Θn
tc
tc
tc
n times
tm
tm
tm
tm
tc
tc
tc
n tc
#
Page 25
Page 27
n
J. Maluszynski, IDA, Linköpings Universitet, 2004.
J. Maluszynski, IDA, Linköpings Universitet, 2004.
On
1 , key K ) : integer
nn 1
tc
2n
T 0 n
tc
TDDB57 DALG – Lecture 2: Analysis of algorithms.
Page 26
2n
1 to
to
7to
1
3
Gives no information about the worst cases
Often difficult to analyze
J. Maluszynski, IDA, Linköpings Universitet, 2004.
Θ 2n
J. Maluszynski, IDA, Linköpings Universitet, 2004.
*
Hanoi
procedure Hanoi integer n char X Y Z :
move n topmost slices from tower X to tower Z, using Y as temporary
if n 1 then out put(“move X to Z”)
else
Hanoi n 1 X Z Y
out put(“move X to Z”)
Hanoi n 1 Y X Z
return
8T n
3to
to
2T n
2
T 1
T n
4T n
Page 28
T n
TDDB57 DALG – Lecture 2: Analysis of algorithms.
)
Average Case Analysis (2)
We have to know the probability distribution for the input data
TDDB57 DALG – Lecture 2: Analysis of algorithms.
Towers of Hanoi
+
TDDB57 DALG – Lecture 2: Analysis of algorithms.
Average Case Analysis (1)
Reconsider TableSearch(): sequential search through a table
Input argument:
one of the table elements,
assume it is chosen with equal probability for all elements.
function TableSearch table key
for i from 0 to n 1 do
if T i K then return i
2
3
n
Expected search time:
1
Page 29
J. Maluszynski, IDA, Linköpings Universitet, 2004.
TDDB57 DALG – Lecture 2: Analysis of algorithms.
Page 30
Lab 1: Experimental evaluation of time complexity
TDDB57 DALG – Lecture 2: Analysis of algorithms.
Amortized Analysis
Page 31
nn
2
1
tc
The total time for linear search of all elements is
Guaranteed!
TDDB57 DALG – Lecture 2: Analysis of algorithms.
Lab 1: A few questions
What kind of time complexity is estimated?
What about estimation of time complexity of another kind?
compare with plots of known functions
plot the results
repeat on data of various size
J. Maluszynski, IDA, Linköpings Universitet, 2004.
run the program on each of them, measure the time, take average
generate randomly several input data of a fixed size.
Is it possible to estimate complexity?
Done for sequences of operations and input data.
Is it necessary to generate many inputs of the same size?
J. Maluszynski, IDA, Linköpings Universitet, 2004.
A code may be
too complex for complexity analysis, or
unknown e.g. commercial procedure
Lab1:
Example:
Given: a sorted table T 0 n 1
Input: a permutation e1
en of all elements of T
Download