10. Running Time of Programs
Discrete Maths
242-213, Semester 2, 2015-2016
• Objective
 to describe the Big-Oh notation for estimating the
running time of programs
Overview
1. Running Time
2. Big-Oh and Approximate Running Time
3. Big-Oh for Programs
4. Analyzing Function Calls
5. Analyzing Recursive Functions
6. Further Information
1. Running Time
• What is the running time of this program?

void main()
{ int i, n;
  scanf("%d", &n);
  for(i=0; i<n; i++)
    printf("%d\n", i);
}
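One way to get a concrete (but machine- and load-specific) answer is to time
the program. A minimal sketch using clock() from the standard library (the
fprintf to stderr is our own choice, so the timing doesn't mix with the
loop's output):

#include <stdio.h>
#include <time.h>

int main(void)
{ int i, n;
  scanf("%d", &n);

  clock_t start = clock();         /* processor time before the loop */
  for(i = 0; i < n; i++)
    printf("%d\n", i);
  clock_t end = clock();           /* processor time after the loop */

  /* elapsed processor time, in seconds */
  fprintf(stderr, "took %f secs\n",
          (double)(end - start) / CLOCKS_PER_SEC);
  return 0;
}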
Counting Instructions
 Assume 1 instruction takes 1 ms:

  n value      no. of loops   Time (ms)
  1            1              3
  1,000        1,000          3,000
  1,000,000    1,000,000      3,000,000
• There is no single answer!
 the running time depends on the value of n
• Instead of a time answer in seconds, we want a time
answer which is related to the size of the input.
• For example:
 programTime(n) = constant * n
 this means that as n gets bigger, so does the program time
 running time is linearly related to the input size
[Graph: running time = constant * n, a straight line rising with the size of n]
 A simple way of writing the running time in the table above is:
T(n) = 3n
Running Time Theory
• A program/algorithm has a running time T(n)
 n is some measure of the input size
• T(n) is the largest amount of time the program takes
on any input of size n
• Time units are left unspecified.
1.1. Kinds of Running Time
Worst-case: (we use this usually)
• T(n) = maximum time of algorithm on any input of size n.
- one possible value
Average-case: (we sometimes use this)
• T(n) = expected time of algorithm over all inputs of size n.
- this approach needs info about the statistical distribution
(probability) of the inputs.
- e.g. uniform spread of data (i.e. all data is equally likely)
Best-case: (don't use this, it's misleading)
• e.g. write a slow algorithm that works fast on specially selected input.
1.2. T(n) Example
 Loop fragment for finding the product of all the
positive numbers in the A[] array of size n:

(2)  int prod = 1;
(3)  for(j = 0; j < n; j++)
(4)    if (A[j] > 0)
(5)      prod = prod * A[j];

 Count each assignment and test as 1 "time unit".
Convert 'for' to 'while'
 The while-loop is easier to count (and equivalent to
the for-loop). We assume that 1 instruction takes
1 "time unit". What about counting the loop?

int prod = 1;           // 1
int j = 0;              // 1
while (j < n) {         // 1 for the test
  if (A[j] > 0)         // 1
    prod = prod*A[j];   // 1 + 1
  j++;                  // 1
}
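To sanity-check the hand count, one could instrument the fragment with a
counter bumped once per counted operation (a sketch of ours; the all-positive
array gives the worst case, so it should print 5n + 3 = 28 for n = 5):

#include <stdio.h>

int main(void)
{ int A[] = {4, 2, 7, 1, 3};   /* all positive: the worst case */
  int n = 5;
  long ops = 0;                /* counted "time units" */

  int prod = 1;  ops++;        /* 1: assignment */
  int j = 0;     ops++;        /* 1: init of j */
  while (ops++, j < n) {       /* 1 per test, incl. the final failing one */
    ops++;                     /* 1: if-test */
    if (A[j] > 0) {
      prod = prod * A[j];
      ops += 2;                /* 1 + 1: multiply and assign */
    }
    j++;  ops++;               /* 1: increment */
  }

  printf("prod = %d, ops = %ld (5n + 3 = %d)\n", prod, ops, 5*n + 3);
  return 0;
}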
Calculation
• The for loop executes n times
 each loop carries out (in the worst case) 5 ops
   test of j < n, if-test, multiply, assign, j increment
 total loop time = 5n
 plus 3 ops at start and end
   small assign (line 2), init of j (line 3), final j < n test
• Total time T(n) = 5n + 3
 running time is linear with the size of the array
1.3. Comparing Different T()’s
[Graph: Ta(n) = 100n and Tb(n) = 2n^2, T(n) value against input
size n; the two curves cross at n = 50]
• If input size < 50, program B is faster.
• But for large n’s, which are more common in
real code, program B gets worse and worse.
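The crossover is easy to check with a small table-printing sketch
(ours, not part of the slides):

#include <stdio.h>

int main(void)
{ /* Ta(n) = 100n (program A), Tb(n) = 2n^2 (program B) */
  for (int n = 10; n <= 100; n += 10) {
    long ta = 100L * n;
    long tb = 2L * n * n;
    printf("n=%-4d Ta=%-6ld Tb=%-6ld %s\n", n, ta, tb,
           tb < ta ? "B faster" : tb > ta ? "A faster" : "equal");
  }
  return 0;
}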
1.4. Common Growth Formulae & Names

  Formula (n = input size)   Name
  n                          linear
  n^2                        quadratic
  n^3                        cubic
  n^m                        polynomial, e.g. n^10
  m^n (m >= 2)               exponential, e.g. 5^n
  n!                         factorial
  1                          constant
  log n                      logarithmic
  n log n                    n log n (linearithmic)
  log log n                  doubly logarithmic
1.5. Execution Times
Assume 1 instruction takes 1 microsec (10^-6 secs) to execute.
How long will T(n) instructions take?

  growth         n (no. of instructions)
  formula T()    3    9     50      100          1000          10^6
  log n          2    3     6       7            10            20
  n              3    9     50      100          1ms           1sec
  n^2            9    81    2.5ms   10ms         1sec          12 days
  n^3            27   729   125ms   1sec         16.7 min      31,710 yr
  2^n            8    512   36 yr   4*10^16 yr   3*10^287 yr   3*10^301016 yr

  (entries without units are microseconds)
• e.g. with the 2^n algorithm, if n is 50 you will
wait 36 years for an answer!
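These entries can be reproduced with a rough sketch (ours), using doubles;
the very largest values (e.g. 2^n for n = 10^6) overflow to infinity, which
is rather the point — they are absurdly large:

#include <stdio.h>
#include <math.h>

/* seconds taken by T(n) instructions at 1 microsec each */
double secs(double instructions) { return instructions * 1e-6; }

int main(void)
{ double ns[] = {3, 9, 50, 100, 1000, 1e6};
  for (int i = 0; i < 6; i++) {
    double n = ns[i];
    printf("n=%-8.0f n^2: %-10.3g n^3: %-10.3g 2^n: %-10.3g (secs)\n",
           n, secs(n*n), secs(n*n*n), secs(pow(2.0, n)));
  }
  return 0;
}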
Notes
• Logarithmic running times are best.
• Polynomial running times are acceptable, if the
power isn't too big
 e.g. n^2 is ok, n^100 is terrible
• Exponential times mean sloooooooow code.
 some size problems may take longer to finish than
the lifetime of the universe!
1.6. Why use T(n)?
• T() can guide our choice of which algorithm to
implement, or program to use
 e.g. selection sort or merge sort?
• T() helps us look for better algorithms in our own
code, without expensive implementation, testing, and
measurement.
T() is too Machine Dependent
 We want T() to be the same for an algorithm,
independent of the machine where it is running.
 This is not true, since different machines (and OSes)
execute instructions at different speeds.
 Consider the loop example from section 1.2:
 on machine A, every instruction takes 1 "time unit"
 the result is TA(n) = 5n + 3
 On machine B, every instruction takes 1 "time unit",
except for multiplication, which takes 5 "time units".
 The for loop executes n times
 each loop carries out (in the worst case) 5 ops,
costing 9 time units (test of j < n, if-test,
multiply at 5 units, assign, j increment)
 total loop time = 9n
 plus 3 ops at start and end
(small assign (line 2), init of j (line 3), final j < n test)
 Total time TB(n) = 9n + 3
 running time is still linear with the size of the array
 TA() = 5n + 3 and TB() = 9n + 3
 These are both linear equations (which is good), but
the constants are different (which is bad)
 We want a T() notation that is independent of
machines.
2. Big-Oh and Running Time
 Big-Oh notation for T(n) ignores constant factors
which depend on compiler/machine behaviour
 that's good
 Big-Oh simplifies the process of estimating the
running time of programs
 we don't need to count every code line
 that's also good
• The Big-Oh value specifies running time independent of:
 machine architecture
   e.g. don't consider the running speed of individual
machine operations
 machine load (usage)
   e.g. time delays due to other users
 compiler design effects
   e.g. gcc versus Visual C
Example
 When we counted instructions for the loop example,
we found:
 TA() = 5n + 3
 TB() = 9n + 3
 The Big-Oh equation, O(), is based on the T(n)
equation but ignores constants (which vary from
machine to machine). This means for both machine
A and B:
T(n) is O(n)
we say "T(n) is order n"
or "T(n) is about n"
More Examples

  T(n) value          Big-Oh value: O()
  10n^2 + 50n + 100   O(n^2)
  (n+1)^2             O(n^2)
  n^10                O(2^n)   (hard to understand)
  5n^3 + 1            O(n^3)
• These simplifications have a mathematical reason,
which is explained in section 2.2.
2.1. Is Big-Oh Useful?
 O() ignores constant factors, which means it is a more
general measure of running time for algorithms across
different platforms/compilers.
 It can be compared with Big-Oh values for other
algorithms.
 e.g. linear is better than polynomial and exponential, but
worse than logarithmic
 i.e. O(log n) < O(n) < O(n^2) < O(2^n)
2.2. Definition of Big-Oh
 T(n) is O( g(n) ) means:
 g(n) is the most important thing in T() when n is
large
 Example 1:
 T(n) = 5n + 3
 write as T(n) is O(n) // the g() function is n
 Example 2:
 T(n) = 9n + 3
 write as T(n) is O(n) // the g() function is n
More Formally
We write T(n) is O(g(n)) if there exist constants c > 0,
n0 > 0 such that 0 <= T(n) <= c*g(n) for all n >= n0.
 n0 and c are called witnesses to "T(n) is O(g(n))"
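A witness pair can be sanity-checked numerically before doing the algebra
(a sketch of ours, using the T() and g() of Example 1 below; checking a
sample of n values is evidence, not a proof):

#include <stdio.h>

double T(double n) { return 10*n*n + 50*n + 100; }  /* T(n) */
double g(double n) { return n*n; }                  /* g(n) = n^2 */

int main(void)
{ double c = 160, n0 = 1;
  /* check T(n) <= c*g(n) on a sample of n >= n0 */
  for (double n = n0; n <= 1e6; n *= 10)
    printf("n=%-10.0f T(n)=%-14.0f c*g(n)=%.0f\n", n, T(n), c*g(n));
  return 0;
}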
O-notation as a Graph
 O-notation gives an upper bound for a function to
within a constant factor. We write T(n) is O(g(n)) if
there are positive constants n0 and c such that at and
to the right of n0, the value of T(n) always lies on
or below c*g(n).

[Graph: c*g(n) lies above T(n) everywhere to the right of n0]
Example 1
• T(n) = 10n^2 + 50n + 100
 which means that T(n) is O(n^2)
 the n^2 part is the most important thing in the T() function
• Why?
 Witnesses: n0 = 1, c = 160
 then T(n) <= c*g(n), n >= 1
so 10n^2 + 50n + 100 <= 160n^2
since 10n^2 + 50n + 100 <= 10n^2 + 50n^2 + 100n^2 = 160n^2 (for n >= 1)
T() and O() Graphed

[Graph: T(n) = 10n^2 + 50n + 100 and c*g(n) = 160n^2, with n0 = 1;
c*g(n) lies above T(n) for all n >= n0. Plotted with
http://dlippman.imathas.com/graphcalc/graphcalc.html]
Example 2
• T(n) = (n+1)^2
 which means that T(n) is O(n^2)
• Why?
 Witnesses: n0 = 1, c = 4
 then T(n) <= c*g(n), n >= 1
so (n+1)^2 <= 4n^2
since n^2 + 2n + 1 <= n^2 + 2n^2 + n^2 = 4n^2 (for n >= 1)
T() and O() Graphed

[Graph: T(n) = (n+1)^2 and c*g(n) = 4n^2, with n0 = 1;
c*g(n) lies above T(n) for all n >= n0]
Example 3
• T(n) = n^10
 which means that T(n) is O(2^n)
• Why?
 Witnesses: n0 = 64, c = 1
 then T(n) <= c*g(n), n >= 64
so n^10 <= 2^n
since 10*log2 n <= n (by taking log2s),
which is true when n >= 64
(10*log2 64 == 10*6; 60 <= 64)
n^10 and 2^n Graphed

[Graph: T(n) = n^10 and c*g(n) = 2^n; the curves cross near
(58.770, 4.915E17), and 2^n lies above n^10 beyond that point]
3. Big-Oh for Programs
• First decide on a size measure for the data in the
program. This will become the n.

  Data Type   Possible Size Measure
  integer     its value
  string      its length
  array       its length
3.1. Building a Big-Oh Result
• The Big-Oh value for a program is built up inductively:
 1) Calculate the Big-Oh's for all the simple statements
in the program
   e.g. assignment, arithmetic
 2) Then use those values to obtain the Big-Oh's for the
complex statements
   e.g. blocks, for loops, if-statements
Simple Statements (in C)
• We assume that simple statements always take a
constant amount of time to execute
 written as O(1)
 this is not a time unit (not 1 ms, not 1 microsec)
 O(1) means a running time independent of the input
size n
• Kinds of simple statements:
 assignment, break, continue, return, all library
functions (e.g. putchar(), scanf()), arithmetic, boolean
tests, array indexing
Complex Statements
• The Big-Oh value for a complex statement is a
combination of the Big-Oh values of its component
simple statements.
• Kinds of complex statements:
 blocks { ... }
 conditionals: if-then-else, switch
 loops: for, while, do-while
3.2. Structure Trees
• The easiest way to see how complex statement timings
are based on simple statements (and other complex
statements) is by drawing a structure tree for the
program.
Example: binary conversion

     void main()
     { int i;
(1)    scanf("%d", &i);
(2)    while (i > 0) {
(3)      putchar('0' + i%2);
(4)      i = i/2;
       }
(5)    putchar('\n');
     }
Structure Tree for Example

  block 1-5
    1
    while 2-4
      block 3-4    <-- the time for this is the time for (3) + (4)
        3
        4
    5
3.3. Details for Complex Statements
• Blocks: Running time bound =
summation of the bounds of its parts
("summation" means 'add')
• The summation rule means that only the
largest Big-Oh value is considered.
Block Calculation Graphically
 A block made of parts costing O( f1(n) ), O( f2(n) ), ..., O( fk(n) )
has, by the summation rule, a total cost of:
O( f1(n) + f2(n) + ... + fk(n) )
In other words: O( largest fi(n) )
 The cost of a block is the cost of the biggest statement.
Block Summation Rule Example
• First block's time T1(n) = O(n^2)
• Second block's time T2(n) = O(n)
• Total running time = O(n^2 + n)
= O(n^2)   // the largest part
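As a concrete fragment (our own illustration), a quadratic block followed
by a linear block:

/* two blocks in sequence; total O(n^2 + n) = O(n^2) */
void zero_all(int n, int M[n][n], int v[n])
{ int i, j;

  /* first block: O(n^2) */
  for (i = 0; i < n; i++)
    for (j = 0; j < n; j++)
      M[i][j] = 0;

  /* second block: O(n) */
  for (i = 0; i < n; i++)
    v[i] = 0;
}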
Conditionals
e.g. if statements, switches
• Conditionals: Running time bound =
the cost of the if-test +
larger of the bounds for the
if- and else- parts
• When the if-test is a simple statement (a
boolean test), it is O(1).
Conditional Graphically
 An if-statement with an O(1) test, an if-part costing
O( f1(n) ), and an else-part costing O( f2(n) ), has a
total cost of:
O( max( f1(n), f2(n) ) + 1 )
which is the same as
O( max( f1(n), f2(n) ) )
 The cost of an if-statement is the cost
of the biggest branch.
If Example
• Code fragment:

if (x < y)   // O(1)
  foo(x);    // O(n)
else
  bar(y);    // O(n^2)

• Total running time = O( max(n, n^2) + 1 )
= O(n^2 + 1)
= O(n^2)
Loops
• Loops: Running time bound is usually =
the max. number of times round the loop *
the time to execute the loop body once
• But we must include O(1) for the increment and test
each time around the loop.
• Must also include the initialization and final test costs
(both O(1)).
While Graphically
 A while-loop with an O(1) test and a body costing
O( f(n) ), going at most g(n) times around, is altogether:
O( g(n)*(f(n)+1) + 1 )
which can be simplified to:
O( g(n)*f(n) )
 The cost of a loop is the cost of the body *
the number of loops.
While Loop Example
• Code fragment:

x = 0;
while (x < n) {   // O(1) for test
  foo(x, n);      // O(n^2)
  x++;            // O(1)
}

• Total running time of the loop, with lp(n) = n
and f(n) = 1 + n^2 + 1:
= O( n*(1 + n^2 + 1) + 1 )
= O(n^3 + 2n + 1) = O(n^3)
For-loop Graphically
 A for-loop with O(1) initialization, O(1) test, and O(1)
increment, and a body costing O( f(n) ), going at most
g(n) times around, is altogether:
O( g(n)*(f(n)+1+1) + 1 )
which can be simplified to:
O( g(n)*f(n) )
 The cost of a loop is the cost of the body *
the number of loops.
For Loop Example
• Code Fragment:

for (i=0; i < n; i++)
  foo(i, n);    // O(n^2)

• It helps to rewrite this as a while loop:

i=0;              // O(1)
while (i < n) {   // O(1) for test
  foo(i, n);      // O(n^2)
  i++;            // O(1)
}
• Running time for the for loop, with lp(n) = n
and f(n) = 1 + n^2 + 1:
= O( 1 + n*(1 + n^2 + 1) + 1 )
= O( 2 + n^3 + 2n )
= O(n^3)
3.4.1. Example: nested loops

(1) for(i=0; i < n; i++)
(2)   for (j = 0; j < n; j++)
(3)     A[i][j] = 0;

• line (3) is a simple op - takes O(1)
• line (2) is a loop carried out n times
 takes O(n * 1) = O(n)
• line (1) is a loop carried out n times
 takes O(n * n) = O(n^2)
3.4.2. Example: if statement

(1) if (A[0][0] == 0) {
(2)   for(i=0; i < n; i++)
(3)     for (j = 0; j < n; j++)
(4)       A[i][j] = 0;
    }
(5) else {
(6)   for (i=0; i < n; i++)
(7)     A[i][i] = 1;
    }
• The if-test takes O(1);
the if block takes O(n^2);
the else block takes O(n).
• Total running time:
= O(1) + O( max(n^2, n) )
= O(1) + O(n^2)
= O(n^2)   // using the summation rule
3.4.3. Time for a Binary Conversion

     void main()
     { int i;
(1)    scanf("%d", &i);
(2)    while (i > 0) {
(3)      putchar('0' + i%2);
(4)      i = i/2;
       }
(5)    putchar('\n');
     }
• Lines 1, 2, 3, 4, 5: each O(1)
• Block of 3-4 is O(1) + O(1) = O(1)
• While of 2-4 loops at most (log2 i)+1 times (why? see below)
 total running time = O( 1 * ((log2 i)+1) ) = O(log2 i)
• Block of 1-5:
= O(1) + O(log2 i) + O(1)
= O(log2 i)
Why (log2 i)+1 ?
• Assume i = 2^k
• Start of 1st iteration:     i = 2^k
Start of 2nd iteration:     i = 2^(k-1)
Start of 3rd iteration:     i = 2^(k-2)
...
Start of kth iteration:     i = 2^(k-(k-1)) = 2^1 = 2
Start of (k+1)th iteration: i = 2^(k-k) = 2^0 = 1
 the while will terminate after this iteration
• Since 2^k = i, we have k = log2 i
• So k+1, the no. of iterations, = (log2 i)+1
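A small test harness (ours) confirms the count, comparing the actual number
of iterations with (log2 i)+1 — for a general i, that is floor(log2 i) + 1:

#include <stdio.h>
#include <math.h>

int main(void)
{ int tests[] = {1, 2, 8, 100, 1024, 1000000};
  for (int t = 0; t < 6; t++) {
    int i = tests[t], start = i, iters = 0;
    while (i > 0) {    /* same loop shape as the conversion code */
      i = i/2;
      iters++;
    }
    printf("i=%-8d iterations=%-3d (log2 i)+1=%d\n",
           start, iters, (int)floor(log2((double)start)) + 1);
  }
  return 0;
}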
Using a Structure Tree

  block 1-5  O(log2 i)
    1            O(1)
    while 2-4    O(log2 i)
      block 3-4    O(1)
        3            O(1)
        4            O(1)
    5            O(1)
3.4.4. Time for a Selection Sort

     void selectionSort(int A[], int n)
     {
       int i, j, small, temp;
(1)    for (i=0; i < n-1; i++) {    // outer loop
(2)      small = i;
(3)      for (j=i+1; j < n; j++)    // inner loop
(4)        if (A[j] < A[small])
(5)          small = j;
(6)      temp = A[small];           // exchange
(7)      A[small] = A[i];
(8)      A[i] = temp;
       }
     }
Selection Sort Structure Tree

  for 1-8
    block 2-8
      2
      for 3-5
        if 4-5
          5        <-- if part (the else part is empty)
      6
      7
      8
• Lines 2, 5, 6, 7, 8: each is O(1)
• If of 4-5 is O( max(1,0) + 1 ) = O(1)
• For of 3-5 is O( (n-(i+1)) * 1 ) = O(n-i-1)
= O(n), simplified
• Block of 2-8
= O(1) + O(n) + O(1) + O(1) + O(1) = O(n)
• For of 1-8 is:
= O( (n-1) * n ) = O(n^2 - n)
= O(n^2), simplified
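The quadratic behaviour can be seen directly by counting the inner-loop
comparisons (the global counter and the driver are our additions); for an
array of size n they total (n-1) + (n-2) + ... + 1 = n(n-1)/2:

#include <stdio.h>

long comparisons = 0;    /* counts executions of the if-test (line 4) */

void selectionSort(int A[], int n)
{ int i, j, small, temp;
  for (i=0; i < n-1; i++) {
    small = i;
    for (j=i+1; j < n; j++) {
      comparisons++;
      if (A[j] < A[small])
        small = j;
    }
    temp = A[small]; A[small] = A[i]; A[i] = temp;
  }
}

int main(void)
{ int A[1000], n = 1000;
  for (int k = 0; k < n; k++)
    A[k] = n - k;                  /* reverse-sorted input */
  selectionSort(A, n);
  printf("comparisons = %ld, n(n-1)/2 = %d\n",
         comparisons, n*(n-1)/2);  /* both are 499500 */
  return 0;
}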
4. Analyzing Function Calls
• In this section, we assume that the functions are
not recursive
 we add recursion in section 5
• Size measures for all the functions must be similar, so
they can be combined to give the program's Big-Oh
value.
Example Program

#include <stdio.h>

int bar(int x, int n);
int foo(int x, int n);

    void main()
    { int a, n;
(1)   scanf("%d", &n);
(2)   a = foo(0, n);
(3)   printf("%d\n", bar(a,n));
    }

    int bar(int x, int n)
    { int i;
(4)   for(i = 1; i <= n; i++)
(5)     x += i;
(6)   return x;
    }

    int foo(int x, int n)
    { int i;
(7)   for(i = 1; i <= n; i++)
(8)     x += bar(i, n);
(9)   return x;
    }
Calling Graph

  main
    foo
      bar
    bar

(main calls foo and bar; foo calls bar)
Calculating Times with a Calling Graph
• 1. Calculate times for Group 0 functions
 those that call no other user functions
• 2. Calculate times for Group 1 functions
 those that call Group 0 functions only
• 3. Calculate times for Group 2 functions
 those that call Group 0 and Group 1 functions only
• 4. Continue until the time for main() is obtained.
Example Program Analysis
• Group 0: bar() is O(n)
• Group 1: foo() is O( n * n ) = O(n^2)
 n loops, with the O(n) bar() in the body
• Group 2: main() is
= O(1) + O(n^2) + O(1) + O(n)
= O(n^2)
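The O(n^2) claim for foo() can be checked by counting how many times the
+= in bar() runs (the adds counter and the driver are our additions);
foo() makes n calls, each doing n additions:

#include <stdio.h>

long adds = 0;    /* counts the += in bar() */

int bar(int x, int n)
{ int i;
  for(i = 1; i <= n; i++) {
    x += i;
    adds++;
  }
  return x;
}

int foo(int x, int n)
{ int i;
  for(i = 1; i <= n; i++)
    x += bar(i, n);
  return x;
}

int main(void)
{ int n = 1000;
  foo(0, n);
  printf("adds = %ld (n^2 = %ld)\n", adds, (long)n * n);
  return 0;
}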
5. Analyzing Recursive Functions
• Recursive functions call themselves with a "smaller
size" argument, and terminate by reaching a base case.

int factorial(int n)
{ if (n <= 1)
    return 1;
  else
    return n * factorial(n-1);
}
Running Time for a Recursive Function
• 1. Develop basis and inductive statements for the
running time.
• 2. Solve the corresponding recurrence relation.
 this usually requires the Big-Oh notation to be
rewritten as constants and multiples of n
 e.g. O(1) becomes a, O(n) becomes b*n,
O(n^2) becomes c*n^2, etc.
• 3. Translate the solved relation back into Big-Oh
notation
 rewrite the remaining constants back into Big-Oh
form
 e.g. a becomes O(1), b*n becomes O(n)
5.1. Factorial Running Time
• Step 1.
 Basis: T(1) = O(1)
 Induction: T(n) = O(1) + T(n-1), for n > 1
• Step 2.
 Simplify the relation by replacing the O() notation
with constants.
 Basis: T(1) = a
 Induction: T(n) = b + T(n-1), for n > 1
• The simplest way to solve T(n) is to calculate it for
some values of n, and then guess the general
expression.
T(1) = a
T(2) = b + T(1) = b + a
T(3) = b + T(2) = 2b + a
T(4) = b + T(3) = 3b + a
• “Obviously”, the general form is:
T(n) = ((n-1)*b) + a
= bn + (a-b)
• Step 3. Translate back:
T(n) = bn + (a-b)
• Replace constants by Big-Oh notation:
T(n) = O(n) + O(1)
= O(n)
• The running time for recursive factorial is
O(n). That is fast.
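The closed form can be checked against the recurrence itself (a sketch of
ours, with arbitrary constants a and b):

#include <stdio.h>

#define A 3.0   /* a: cost of the base case, T(1) */
#define B 5.0   /* b: cost of one recursive step  */

/* T(n) computed directly from the recurrence */
double T(int n) { return (n <= 1) ? A : B + T(n-1); }

int main(void)
{ for (int n = 1; n <= 6; n++)
    printf("T(%d) = %.0f, bn + (a-b) = %.0f\n", n, T(n), B*n + (A - B));
  return 0;
}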
5.2. Recursive Selection Sort

void rSSort(int A[], int n)
{ int imax, i;
  if (n == 1)
    return;
  else {
    imax = 0;               /* A[0] is biggest so far */
    for (i=1; i < n; i++)
      if (A[i] > A[imax])
        imax = i;
    swap(A, n-1, imax);
    rSSort(A, n-1);
  }
}
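The swap() helper is not shown on the slide; a minimal definition consistent
with the call swap(A, n-1, imax) might be:

/* exchange A[i] and A[j] (an assumed definition, not from the slides) */
void swap(int A[], int i, int j)
{ int temp = A[i];
  A[i] = A[j];
  A[j] = temp;
}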
Running Time
(n == the size of the array; assume swap() is O(1), so ignore it)
• Step 1.
 Basis: T(1) = O(1)
 Induction: T(n) = O(n-1) + T(n-1), for n > 1
   the O(n-1) (a multiple of n-1) is the loop;
the T(n-1) is the call to rSSort()
• Step 2.
 Basis: T(1) = a
 Induction: T(n) = b(n-1) + T(n-1), for n > 1
• Solve the relation:
 T(1) = a
T(2) = b + T(1) = b + a
T(3) = 2b + T(2) = 2b + b + a
T(4) = 3b + T(3) = 3b + 2b + b + a
• General Form:
 T(n) = (n-1)b + (n-2)b + ... + b + a
      = a + b * (0 + 1 + 2 + ... + (n-1))
      = a + b(n-1)n/2
• Step 3. Translate back:
T(n) = a + b(n-1)n/2
• Replace constants by Big-Oh notation:
T(n) = O(1) + O(n^2) + O(n)
= O(n^2)
• The running time for recursive selection sort is
O(n^2). That is slow for large arrays.
6. Further Information
• Discrete Mathematics and its Applications
Kenneth H. Rosen
McGraw Hill, 2012, 7th edition
• chapter 3, sections 3.2 – 3.3