CSE 326: Data Structures
Introduction &
Part One: Complexity
Henry Kautz
Autumn Quarter 2002
Overview of the Quarter
• Part One: Complexity
  – inductive proofs of program correctness
  – empirical and asymptotic complexity
  – order of magnitude notation; logs & series
  – analyzing recursive programs
• Part Two: List-like data structures
• Part Three: Sorting
• Part Four: Search Trees
• Part Five: Hash Tables
• Part Six: Heaps and Union/Find
• Part Seven: Graph Algorithms
• Part Eight: Advanced Topics
Material for Part One
• Weiss Chapters 1 and 2
• Additional material
– Graphical analysis
– Amortized analysis
– Stretchy arrays
• Any questions on course organization?
Program Analysis
• Correctness
– Testing
– Proofs of correctness
• Efficiency
– How to define?
– Asymptotic complexity - how running time scales as a
function of the size of the input
Proving Programs Correct
• Often takes the form of an inductive proof
• Example: summing an array
int sum(int v[], int n)
{
    if (n == 0) return 0;
    else return v[n-1] + sum(v, n-1);
}
What are the parts of an inductive proof?
Inductive Proof of Correctness
int sum(int v[], int n)
{
    if (n == 0) return 0;
    else return v[n-1] + sum(v, n-1);
}
Theorem: sum(v,n) correctly returns the sum of the 1st n elements of
array v, for any n.
Basis Step: Program is correct for n=0; returns 0. 
Inductive Hypothesis (n=k): Assume sum(v,k) returns the sum of the
first k elements of v.
Inductive Step (n=k+1): sum(v,k+1) returns v[k]+sum(v,k), which is
the sum of the first k+1 elements of v. 
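For concreteness, here is the same function as a self-contained
program with a small test driver (the driver is our addition for
illustration, not part of the slides):

#include <stdio.h>

int sum(int v[], int n)
{
    if (n == 0) return 0;
    else return v[n-1] + sum(v, n-1);
}

int main(void)
{
    int v[] = {1, 2, 3, 4};
    printf("%d\n", sum(v, 4));  /* prints 10: basis n=0 plus four inductive steps */
    return 0;
}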
Proof by Contradiction
• Assume negation of goal, show this leads to a
contradiction
• Example: there is no program that solves the
“halting problem”
– Determines if any other program runs forever or not
Alan Turing, 1937
Program NonConformist (Program P)
    If ( HALT(P) = “never halts” ) Then
        Halt
    Else
        Do While (1 > 0)
            Print “Hello!”
        End While
    End If
End Program
• Does NonConformist(NonConformist) halt?
• Yes? That means HALT(NonConformist) = “never halts”
• No? That means HALT(NonConformist) = “halts”
• Either way, HALT gave the wrong answer about NonConformist, a
contradiction, so no program HALT can exist.
Defining Efficiency
• Asymptotic Complexity - how running time scales
as function of size of input
• Why is this a reasonable definition?
Defining Efficiency
• Asymptotic Complexity - how running time scales
as function of size of input
• Why is this a reasonable definition?
– Many kinds of small problems can be solved in practice
by almost any approach
• E.g., exhaustive enumeration of possible solutions
• Want to focus efficiency concerns on larger problems
– Definition is independent of any possible advances in
computer technology
Technology-Dependent Efficiency
• Drum computers: popular technology in the early 1960s
• Transistors were too costly to use for RAM, so memory
was kept on a revolving magnetic drum
• An efficient program scattered its instructions around the
drum so that the next instruction to execute was under the
read head just when it was needed
– Minimized the number of full revolutions
of the drum during execution
The Apocalyptic Laptop
• Speed is limited by energy consumption:
E = mc² yields about 25 million megawatt-hours
• Quantum mechanics: switching speed = h / (2 × Energy),
where h is Planck’s constant
• Result: 5.4 × 10^50 operations per second
Seth Lloyd, SCIENCE, 31 Aug 2000
[Chart: operations required (log scale, 1 to 10^60) vs. N (1 to 1000)
for the functions 2^N, 1.2^N, N^5, N^3, and 5N, compared against four
compute budgets: the Ultimate Laptop running for 1 second or 1 year,
and a 1000 MIPS machine running for 1 day or since the Big Bang]
Defining Efficiency
• Asymptotic Complexity - how running time scales as
function of size of input
• What is “size”?
– Often: length (in characters) of input
– Sometimes: value of input (if input is a number)
• Which inputs?
– Worst case
• Advantages / disadvantages ?
– Best case
• Why?
Average Case Analysis
• More realistic analysis, first attempt:
– Assume inputs are randomly distributed according to
some “realistic” distribution 
– Compute expected running time
E(T, n) = \sum_{x \in Inputs(n)} Prob(x) \cdot RunTime(x)
– Drawbacks
• Often hard to define realistic random distributions
• Usually hard to perform math
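As a quick worked example of the formula above (ours, not from the
slides): if x is equally likely to be at each of the n positions in a
linear search, then

    E(T, n) = \sum_{i=1}^{n} \frac{1}{n} \cdot i = \frac{n+1}{2}

so on average about half the array is scanned.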
Amortized Analysis
• Instead of a single input, consider a sequence of
inputs
• Choose worst possible sequence
• Determine average running time on this sequence
• Advantages
– Often less pessimistic than simple worst-case analysis
– Guaranteed results - no assumed distribution
– Usually mathematically easier than average case analysis
Comparing Runtimes
• Program A is asymptotically less efficient than
program B iff
the runtime of A dominates the runtime of B, as the size
of the input goes to infinity
\frac{RunTime(A, n)}{RunTime(B, n)} \to \infty  as  n \to \infty
• Note: RunTime can be “worst case”, “best case”,
“average case”, “amortized case”
Which Function Dominates?
n^3 + 2n^2       vs.  100n^2 + 1000
n^0.1            vs.  log n
n + 100n^0.1     vs.  2n + 10 log n
5n^5             vs.  n!
n^-15 2^n/100    vs.  1000n^15
8^(2 log n)      vs.  3n^7 + 7n
Race I: n^3 + 2n^2 vs. 100n^2 + 1000
(n^3 + 2n^2 dominates: the cubic term eventually overwhelms any quadratic)
Race II: n^0.1 vs. log n
(n^0.1 dominates: any positive power of n grows faster than log n)
Race III: n + 100n^0.1 vs. 2n + 10 log n
(neither dominates: both are linear, and their ratio approaches a constant)
Race IV: 5n^5 vs. n!
(n! dominates: factorials grow faster than any polynomial)
Race V: n^-15 2^n/100 vs. 1000n^15
(n^-15 2^n/100 dominates: exponentials beat polynomials, despite the constants)
Race VI: 8^(2 log n) vs. 3n^7 + 7n
(3n^7 + 7n dominates: 8^(2 log n) = n^(2 log 8) = n^6, a lower-degree polynomial)
Order of Magnitude Notation
(big O)
• Asymptotic Complexity - how running time scales
as function of size of input
– We usually only care about order of magnitude of
scaling
• Why?
Order of Magnitude Notation
(big O)
• Asymptotic Complexity - how running time scales
as function of size of input
– We usually only care about order of magnitude of
scaling
• Why?
– As we saw, some functions overwhelm other functions
• So if running time is a sum of terms, can drop dominated terms
– “True” constant factors depend on details of compiler
and hardware
• Might as well make constant factor 1
Example: 16n^3 log_8(10n^2) + 100n^2 = O(n^3 log n)

• Eliminate low-order terms
• Eliminate constant coefficients

16n^3 log_8(10n^2) + 100n^2
  → 16n^3 log_8(10n^2)             (drop the low-order term 100n^2)
  → n^3 log_8(10n^2)               (drop the constant coefficient 16)
  = n^3 (log_8(10) + log_8(n^2))
  = n^3 log_8(10) + n^3 log_8(n^2)
  → n^3 log_8(n^2)                 (drop the low-order term n^3 log_8(10))
  = n^3 · 2 log_8(n)
  → n^3 log_8(n)                   (drop the constant coefficient 2)
  = n^3 log_8(2) log n
  → n^3 log n                      (drop the constant factor log_8(2))
Common Names
(from slowest growth to fastest growth)
constant:     O(1)
logarithmic:  O(log n)
linear:       O(n)
log-linear:   O(n log n)
quadratic:    O(n^2)
exponential:  O(c^n)    (c is a constant > 1)

superlinear:  O(n^c)    (c is a constant > 1)
polynomial:   O(n^c)    (c is a constant > 0)
Summary
• Proofs by induction and contradiction
• Asymptotic complexity
• Worst case, best case, average case, and amortized
asymptotic complexity
• Dominance of functions
• Order of magnitude notation
• Next:
– Part One: Complexity, continued
– Read Chapters 1 and 2
Part One: Complexity, continued
Friday, October 4th, 2002
Determining the Complexity of
an Algorithm
• Empirical measurement
• Formal analysis (i.e. proofs)
• Question: what are likely advantages and
drawbacks of each approach?
Determining the Complexity of
an Algorithm
• Empirical measurement
• Formal analysis (i.e. proofs)
• Question: what are likely advantages and
drawbacks of each approach?
– Empirical:
• pro: discover if constant factors are significant
• con: may be running on “wrong” inputs
– Formal:
• pro: no interference from implementation/hardware details
• con: can make mistake in a proof!
In theory, theory is the same as
practice, but not in practice.
Measuring Empirical Complexity:
Linear vs. Binary Search
• Find an item in a sorted array of length N
• Compare the linear search and binary search algorithms

                         Linear Search    Binary Search
Time to find one item:
Time to find N items:

My C code:

void lfind(int x, int a[], int n)
{
    for (i = 0; i < n; i++)
        if (a[i] == x)
            return;
}

void bfind(int x, int a[], int n)
{
    m = n / 2;
    if (x == a[m]) return;
    if (x < a[m])
        bfind(x, a, m);
    else
        bfind(x, &a[m+1], n-m-1);
}

/* test driver: */
for (i = 0; i < n; i++) a[i] = i;
for (i = 0; i < n; i++) lfind(i, a, n);   /* or bfind */
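A minimal timing harness in that spirit (a sketch, assuming the
standard clock() from <time.h>; N is an arbitrary test size):

#include <stdio.h>
#include <time.h>

#define N 100000

void lfind(int x, int a[], int n)
{
    for (int i = 0; i < n; i++)
        if (a[i] == x) return;
}

int main(void)
{
    static int a[N];                     /* static: avoids a large stack frame */
    for (int i = 0; i < N; i++) a[i] = i;

    clock_t start = clock();
    for (int i = 0; i < N; i++) lfind(i, a, N);
    double secs = (double)(clock() - start) / CLOCKS_PER_SEC;

    printf("N = %d: %.3f seconds\n", N, secs);
    return 0;
}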
Graphical Analysis
[Plot: linear vs. binary search; seconds (0.000 to 0.050) vs. N
(0 to 120), one curve each for linear and binary search]
Graphical Analysis
[Plot: linear vs. binary search; seconds (0.000 to 0.050) vs. N
(0 to 1,000,000), one curve each for linear and binary search]
[Plot: linear vs. binary search; seconds (0 to 5,000) vs. N
(0 to 1,000,000), one curve each for linear and binary search]
[Plot: linear vs. binary search, log/log plot; seconds (1 to 10,000)
vs. N (1 to 1,000,000); both curves appear as straight lines]
[Plot: the same log/log plot with slopes annotated: the linear-search
curve has slope ≈ 2, the binary-search curve has slope ≈ 1]
Property of Log/Log Plots
• On a linear plot, a linear function is a straight line
• On a log/log plot, any polynomial function is a
straight line!
– The slope Δ(log y) / Δ(log x) is the same as the exponent

Proof: Suppose y = c x^k
Then   log y = log(c x^k)
       log y = log c + log(x^k)
       log y = log c + k log x
so with log y on the vertical axis and log x on the horizontal axis,
the slope of the line is k.
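A small sketch of using this property in practice (the two (N, time)
points below are hypothetical measurements, not data from the plots):

#include <math.h>
#include <stdio.h>

int main(void)
{
    /* hypothetical measurements read off a log/log timing plot */
    double n1 = 250000.0,  t1 = 280.0;
    double n2 = 1000000.0, t2 = 4500.0;

    /* slope of the log/log line = estimated exponent k */
    double k = (log(t2) - log(t1)) / (log(n2) - log(n1));
    printf("estimated exponent: %.2f\n", k);   /* ~2.0 here */
    return 0;
}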
Why does O(n log n) look like a straight line?

[Plot: linear vs. binary search, log/log plot, as above; the
binary-search curve, whose total time is Θ(n log n), has slope ≈ 1]

Because log(n log n) = log n + log(log n), and log(log n) grows so
slowly that the curve is nearly straight, with slope just above 1.
Summary
• Empirical and formal analyses of runtime scaling
are both important techniques in algorithm
development
• Large data sets may be required to gain an
accurate empirical picture
• Log/log plots provide a fast and simple visual tool
for estimating the exponent of a polynomial
function
Formal Asymptotic Analysis
• In order to prove complexity results, we must
make the notion of “order of magnitude” more
precise
• Asymptotic bounds on runtime
– Upper bound
– Lower bound
Definition of Order Notation
• Upper bound: T(n) = O(f(n))          (“Big-O”)
  There exist constants c and n' such that
  T(n) ≤ c f(n) for all n ≥ n'
• Lower bound: T(n) = Ω(g(n))          (“Omega”)
  There exist constants c and n' such that
  T(n) ≥ c g(n) for all n ≥ n'
• Tight bound: T(n) = θ(f(n))          (“Theta”)
  When both hold:
  T(n) = O(f(n)) and T(n) = Ω(f(n))
Example: Upper Bound
Claim: n 2  100n  O (n 2 )
Proof: Must find c, n such that for all n  n,
n 2  100n  cn 2
Let's try setting c  2. Then
n 2  100n  2n 2
100n  n 2
100  n
So we can set n  100 and reverse the steps above.
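An empirical sanity check of these constants (our sketch; a finite
loop cannot prove the claim for all n, it only spot-checks it):

#include <assert.h>

int main(void)
{
    /* check n^2 + 100n <= 2n^2 from n' = 100 up to a test bound */
    for (long long n = 100; n <= 1000000; n++)
        assert(n*n + 100*n <= 2*n*n);
    return 0;
}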
Using a Different Pair of Constants
Claim: n 2  100n  O(n 2 )
Proof: Must find c, n such that for all n  n,
n  100n  cn
Let's try setting c  101. Then
2
2
n  100n  100n
2
2
n  100  101n (divide both sides by n)
100  100n
1 n
So we can set n  1 and reverse the steps above.
Example: Lower Bound
Claim: n 2  100n  (n 2 )
Proof: Must find c, n such that for all n  n,
n 2  100n  cn 2
Let's try setting c  1. Then
n 2  100n  n 2
n0
So we can set n  0 and reverse the steps above.
Thus we can also conclude n 2  100n   (n 2 )
Conventions of Order Notation
Order notation is not symmetric: write 2n^2 + n = O(n^2),
but never O(n^2) = 2n^2 + n
The expression O(f(n)) = O(g(n)) is equivalent to
    f(n) = O(g(n))
The expression Ω(f(n)) = Ω(g(n)) is equivalent to
    f(n) = Ω(g(n))
The right-hand side is a "cruder" version of the left:
    18n^2 = O(n^2) = O(n^3) = O(2^n)
    18n^2 = Ω(n^2) = Ω(n log n) = Ω(n)
Upper/Lower vs. Worst/Best
• Worst case upper bound is f(n)
– Guarantee that run time is no more than c f(n)
• Best case upper bound is f(n)
– If you are lucky, run time is no more than c f(n)
• Worst case lower bound is g(n)
– If you are unlucky, run time is at least c g(n)
• Best case lower bound is g(n)
– Guarantee that run time is at least c g(n)
Analyzing Code
• primitive operations
• consecutive statements
• function calls
• conditionals
• loops
• recursive functions
Conditionals
• Conditional
if C then S1 else S2
• Suppose you are doing an O( ) analysis?
• Suppose you are doing an Ω( ) analysis?
Conditionals
• Conditional
if C then S1 else S2
• Suppose you are doing an O( ) analysis?
    Time(C) + Max(Time(S1), Time(S2))
    or Time(C) + Time(S1) + Time(S2)
    or …
• Suppose you are doing an Ω( ) analysis?
    Time(C) + Min(Time(S1), Time(S2))
    or Time(C)
    or …
Nested Loops
for i = 1 to n do
    for j = 1 to n do
        sum = sum + 1
Nested Loops
for i = 1 to n do
    for j = 1 to n do
        sum = sum + 1

\sum_{i=1}^{n} \sum_{j=1}^{n} 1 = \sum_{i=1}^{n} n = n^2
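A quick instrumented check (ours, not from the slides): counting
executions of the inner statement confirms the n^2 total.

#include <stdio.h>

int main(void)
{
    int n = 100;
    long count = 0;
    for (int i = 1; i <= n; i++)
        for (int j = 1; j <= n; j++)
            count++;                    /* stands in for sum = sum + 1 */
    printf("%ld iterations (n^2 = %d)\n", count, n * n);
    return 0;
}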
Nested Dependent Loops
for i = 1 to n do
    for j = i to n do
        sum = sum + 1
Nested Dependent Loops
for i = 1 to n do
    for j = i to n do
        sum = sum + 1

\sum_{i=1}^{n} \sum_{j=i}^{n} 1 = ?
Summary
• Formal definition of order of magnitude notation
• Proving upper and lower asymptotic bounds on a
function
• Formal analysis of conditionals and simple loops
• Next:
– Analyzing complex loops
– Mathematical series
– Analyzing recursive functions
Part One: Complexity,
Continued
Monday October 7, 2002
Today’s Material
• Running time of nested dependent loops
• Mathematical series
• Formal analysis of linear search
• Formal analysis of binary search
• Solving recursive equations
• Stretchy arrays and the Stack ADT
• Amortized analysis
Nested Dependent Loops
for i = 1 to n do
    for j = i to n do
        sum = sum + 1

\sum_{i=1}^{n} \sum_{j=i}^{n} 1 = ?
Nested Dependent Loops
for i = 1 to n do
    for j = i to n do
        sum = sum + 1

\sum_{i=1}^{n} \sum_{j=i}^{n} 1 = \sum_{i=1}^{n} (n - i + 1)
  = \sum_{i=1}^{n} n - \sum_{i=1}^{n} i + \sum_{i=1}^{n} 1
  = n^2 - \sum_{i=1}^{n} i + n
Arithmetic Series
S(N) = 1 + 2 + \cdots + N = \sum_{i=1}^{N} i = ?

• Note that: S(1) = 1, S(2) = 3, S(3) = 6, S(4) = 10, …
• Hypothesis: S(N) = N(N+1)/2
• Prove by induction:
  – Base case: for N = 1, S(N) = 1(2)/2 = 1 
  – Assume true for N = k
  – Suppose N = k+1. Then
    S(k+1) = S(k) + (k+1)
           = k(k+1)/2 + (k+1)
           = (k+1)(k/2 + 1)
           = (k+1)(k+2)/2 
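A quick check of the closed form (our sketch, spot-checking small N
rather than proving the hypothesis):

#include <assert.h>

int main(void)
{
    long long s = 0;                      /* running value of S(N) */
    for (long long N = 1; N <= 1000; N++) {
        s += N;                           /* S(N) = S(N-1) + N */
        assert(s == N * (N + 1) / 2);     /* matches N(N+1)/2 */
    }
    return 0;
}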
Other Important Series
• Sum of squares:
    \sum_{i=1}^{N} i^2 = \frac{N(N+1)(2N+1)}{6} \approx \frac{N^3}{3}  for large N
• Sum of exponents:
    \sum_{i=1}^{N} i^k \approx \frac{N^{k+1}}{|k+1|}  for large N and k ≠ -1
• Geometric series:
    \sum_{i=0}^{N} A^i = \frac{A^{N+1} - 1}{A - 1}
• Novel series:
  – Reduce to known series, or prove inductively
Nested Dependent Loops
for i = 1 to n do
    for j = i to n do
        sum = sum + 1

n^2 - \sum_{i=1}^{n} i + n = n^2 - \frac{n(n+1)}{2} + n
  = n^2/2 + n/2
  = \Theta(n^2)
Linear Search Analysis
void lfind(int x, int a[], int n)
{
    for (i = 0; i < n; i++)
        if (a[i] == x)
            return;
}
• Best case, tight analysis: θ(1) (x is found immediately at a[0])
• Worst case, tight analysis: θ(n) (x is last, or not present at all)
Iterated Linear Search Analysis
for (i=0; i<n; i++) a[i] = i;
for (i=0; i<n; i++) lfind(i,a,n);
• Easy worst-case upper-bound:
• Worst-case tight analysis:
Iterated Linear Search Analysis
for (i = 0; i < n; i++) a[i] = i;
for (i = 0; i < n; i++) lfind(i, a, n);
• Easy worst-case upper-bound: n · O(n) = O(n^2)
• Worst-case tight analysis:
  – Just multiplying the worst case by n does not justify the
answer, since each call to lfind searches for a different i:

\sum_{i=1}^{n} \sum_{j=1}^{i} 1 = \sum_{i=1}^{n} i = \frac{n(n+1)}{2} = \Theta(n^2)
Analyzing Recursive Programs
1. Express the running time T(n) as a recursive equation
2. Solve the recursive equation
   • For an upper-bound analysis, you can optionally
     simplify the equation to something larger
   • For a lower-bound analysis, you can optionally
     simplify the equation to something smaller
Binary Search
void bfind(int x, int a[], int n)
{
    m = n / 2;
    if (x == a[m]) return;
    if (x < a[m])
        bfind(x, a, m);
    else
        bfind(x, &a[m+1], n-m-1);
}
What is the worst-case upper bound?
Binary Search
void bfind(int x, int a[], int n)
{
    m = n / 2;
    if (x == a[m]) return;
    if (x < a[m])
        bfind(x, a, m);
    else
        bfind(x, &a[m+1], n-m-1);
}
What is the worst-case upper bound?
Trick question: as written, there is none. If x is not in the array,
the recursion never reaches a base case, so bfind need not terminate.
Binary Search
void bfind(int x, int a[], int n)
{
    m = n / 2;
    if (n <= 1) return;
    if (x == a[m]) return;
    if (x < a[m])
        bfind(x, a, m);
    else
        bfind(x, &a[m+1], n-m-1);
}
Okay, let’s prove it is O(log n)…
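For reference, a self-contained runnable variant (a sketch; the
slides' version returns void, while this one returns the index of x,
or -1 if absent):

#include <stdio.h>

int bfind(int x, int a[], int n)
{
    if (n <= 0) return -1;                    /* base case: empty range */
    int m = n / 2;
    if (x == a[m]) return m;
    if (x < a[m]) return bfind(x, a, m);      /* search left half */
    int r = bfind(x, a + m + 1, n - m - 1);   /* search right half */
    return (r < 0) ? r : r + m + 1;           /* shift index back */
}

int main(void)
{
    int a[] = {1, 3, 5, 7, 9, 11};
    printf("%d\n", bfind(7, a, 6));    /* prints 3 */
    printf("%d\n", bfind(4, a, 6));    /* prints -1 */
    return 0;
}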
Binary Search
void bfind(int x, int a[], int n)
{
    m = n / 2;
    if (n <= 1) return;
    if (x == a[m]) return;
    if (x < a[m])
        bfind(x, a, m);
    else
        bfind(x, &a[m+1], n-m-1);
}
Introduce some constants…
b = time needed for the base case
c = time needed to get ready to do a recursive call
Running time is thus:
T(1) = b
T(n) = T(n/2) + c
Binary Search Analysis
One sub-problem, half as large

Equation:  T(1) = b
           T(n) = T(n/2) + c    for n > 1

Solution:
T(n) = T(n/2) + c               write equation
     = T(n/4) + c + c           expand
     = T(n/8) + c + c + c       expand
     = T(n/2^k) + kc            inductive leap
     = T(1) + c log n           select value k = log n
     = b + c log n = O(log n)   simplify
Solving Recursive Equations by
Repeated Substitution
• Somewhat “informal”, but intuitively clear and
straightforward

T(n) = T(n/2) + c
T(n) = (T(n/4) + c) + c              substitute for T(n/2)
T(n) = ((T(n/8) + c) + c) + c        substitute for T(n/4)
T(n) = T(n/2^k) + kc                 “inductive leap”
T(n) = T(n/2^(log n)) + c log n      choose k = log n
T(n) = T(n/n) + c log n
     = T(1) + c log n = b + c log n = θ(log n)
Solving Recursive Equations by
Telescoping
• Create a set of equations, take their sum

T(n)   = T(n/2) + c       initial equation
T(n/2) = T(n/4) + c       so this holds...
T(n/4) = T(n/8) + c       and this...
T(n/8) = T(n/16) + c      and this...
...                       and eventually...
T(2)   = T(1) + c

Summing the equations and cancelling the terms that appear on both
sides:
T(n) = T(1) + c log n     look familiar?
T(n) = θ(log n)
Solving Recursive Equations by
Induction
• Repeated substitution and telescoping construct the
solution
• If you know the closed form solution, you can
validate it by ordinary induction
• For the induction, may want to increase n by a
multiple (2n) rather than by n+1
Inductive Proof
T (1)  b  c log1  b
base case
Assume T (n)  b  c log n
hypothesis
T (2n)  T (n)  c
definition of T(n)
T (2n)  (b  c log n)  c by induction hypothesis
T (2n)  b  c((log n)  1)
T (2n)  b  c((log n)  (log 2))
T (2n)  b  c log(2n)
Q.E.D.
Thus: T (n)   (log n)
Example: Sum of Integer Queue
sum_queue(Q){
    if (Q.length() == 0) return 0;
    else return Q.dequeue() + sum_queue(Q);
}
– One subproblem
– Linear reduction in size (decrease by 1)
Equation:  T(0) = b
           T(n) = c + T(n – 1)    for n > 0
Solution (by repeated substitution): T(n) = b + cn = θ(n)
Lower Bound Analysis:
Recursive Fibonacci
int Fib(int n){
    if (n == 0 || n == 1) return 1;
    else return Fib(n - 1) + Fib(n - 2);
}
• Lower bound analysis: Ω(n)
• Instead of =, the equations will use ≥
    T(n) ≥ some expression
• Will simplify the math by throwing out terms on the
right-hand side
Analysis by Repeated Substitution
T(0) = T(1) = a                               base case
T(n) = b + T(n-1) + T(n-2)                    recursive case
T(n) ≥ b + 2T(n-2)                            simplify to smaller quantity
T(n) ≥ b + 2(b + 2T(n-2-2))                   substitute
T(n) ≥ 3b + 4T(n-4)                           simplify
T(n) ≥ 3b + 4(b + 2T(n-4-2))                  substitute
T(n) ≥ 7b + 8T(n-6)                           simplify
T(n) ≥ 7b + 8(b + 2T(n-6-2))                  substitute
T(n) ≥ 15b + 16T(n-8)                         simplify
T(n) ≥ (2^k - 1)b + 2^k T(n-2k)               inductive leap
T(n) ≥ (2^(n/2) - 1)b + 2^(n/2) T(n-2(n/2))   choose k = n/2
T(n) ≥ 2^(n/2)(b + a) - b                     simplify
T(n) = Ω(2^(n/2))
Note: this is not the same as Ω(2^n)!
Learning from Analysis
• To avoid recursive calls
– store all basis values in a table
– each time you calculate an answer, store it in the table
– before performing any calculation for a value n
• check if a valid answer for n is in the table
• if so, return it
• Memoization
– a form of dynamic programming
• How much time does the memoized version take? (see the
sketch below)
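A minimal sketch of the memoized version in C (the table size and
types are our assumptions): each value is computed at most once, so
the running time drops to O(n).

#include <stdio.h>

#define MAXN 64
static long long memo[MAXN];            /* 0 means "not yet computed" */

long long fib(int n)
{
    if (n == 0 || n == 1) return 1;     /* basis values */
    if (memo[n] != 0) return memo[n];   /* valid answer already in table */
    memo[n] = fib(n - 1) + fib(n - 2);  /* compute once, store in table */
    return memo[n];
}

int main(void)
{
    printf("%lld\n", fib(40));   /* fast; the naive version needs at least ~2^(n/2) steps */
    return 0;
}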
Amortized Analysis
• Consider any sequence of operations applied to a
data structure
• Some operations may be fast, others slow
• Goal: show that the average time per operation is
still good:

    amortized time = (total time for n operations) / n
Stack Abstract Data Type
• Stack operations
– push
– pop
– is_empty
[Figure: elements A through F pushed onto a stack; the most recently
pushed element, F, is on top and is popped first]
• Stack property: if x is on the stack before y is
pushed, then x will be popped after y is popped
What is the biggest problem with an array implementation?
Stretchy Stack Implementation
int[] data;
int maxsize;
int top;    // number of elements; also the index of the next free slot

Best case Push = O(1)
Worst case Push = O(n)

Push(e){
    if (top == maxsize){
        temp = new int[2*maxsize];
        for (i = 0; i < maxsize; i++) temp[i] = data[i];
        data = temp;
        maxsize = 2*maxsize;
    }
    data[top++] = e;
}
Stretchy Stack Amortized Analysis
• Consider sequence of n operations
push(3); push(19); push(2); …
• What is the max number of stretches?
• What is the total time?
– let’s say a regular push takes time a, and stretching an array
containing k elements takes time bk.
• Amortized time =
Stretchy Stack Amortized Analysis
• Consider sequence of n operations
push(3); push(19); push(2); …
• What is the max number of stretches? log n
• What is the total time?
– let’s say a regular push takes time a, and stretching an array
containing k elements takes time bk.

an + b(1 + 2 + 4 + 8 + ... + n) = an + b \sum_{i=0}^{\log n} 2^i

• Amortized time =
Geometric Series

\sum_{i=0}^{N} A^i = \frac{A^{N+1} - 1}{A - 1}

\sum_{i=0}^{n} 2^i = \frac{2^{n+1} - 1}{2 - 1} = 2^{n+1} - 1

\sum_{i=0}^{\log n} 2^i = 2^{(\log n) + 1} - 1 = 2 \cdot 2^{\log n} - 1 = 2n - 1
Stretchy Stack Amortized Analysis
• Consider sequence of n operations
push(3); push(19); push(2); …
• What is the max number of stretches? log n
• What is the total time?
– let’s say a regular push takes time a, and stretching an array
containing k elements takes time bk.

an + b(1 + 2 + 4 + 8 + ... + n) = an + b \sum_{i=0}^{\log n} 2^i
                                = an + b(2n - 1)

• Amortized time = \frac{an + b(2n - 1)}{n} = \Theta(1)
Surprise
• In an asymptotic sense, there is no overhead in
using stretchy arrays rather than regular arrays!
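A concrete C sketch of the stretchy stack (our translation of the
slides' pseudocode; malloc/realloc-based, with error handling omitted
for brevity):

#include <stdlib.h>

typedef struct {
    int *data;
    int maxsize;
    int top;                 /* number of elements currently stored */
} Stack;

void stack_init(Stack *s)
{
    s->maxsize = 1;
    s->top = 0;
    s->data = malloc(sizeof(int) * s->maxsize);
}

void stack_push(Stack *s, int e)
{
    if (s->top == s->maxsize) {          /* full: stretch */
        s->maxsize *= 2;                 /* doubling gives O(1) amortized pushes */
        s->data = realloc(s->data, sizeof(int) * s->maxsize);
    }
    s->data[s->top++] = e;               /* ordinary O(1) push */
}

int stack_pop(Stack *s)      { return s->data[--s->top]; }
int stack_is_empty(Stack *s) { return s->top == 0; }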