L01

CS 235102
Data Structures
(資料結構)
Chapter 1:
Basic Concepts
Spring 2012
What is an algorithm?
• An “algorithm” is a set of instructions that solves a well-defined computational problem.
• Algorithms must satisfy the following criteria:
  - Input: zero or more quantities are externally supplied.
  - Output: at least one quantity is produced.
  - Definiteness: each instruction is clear and unambiguous.
  - Finiteness: the algorithm terminates after a finite number of steps.
  - Effectiveness: each instruction is basic enough to be carried out easily.

Describing an algorithm
• There are many ways to describe an algorithm:
  - English sentences.
  - Graphic representations (called flowcharts).
    These work well only if the algorithm is small and simple.
  - The C language, mixed with English sentences.
    This is our practice in this course.

Binary Search
• Assume we have n ≥ 1 distinct integers sorted in an array A[0 … n-1].
• Determine whether an integer x is in the array:
  - if x = A[j], return the index j;
  - otherwise, return -1.

      A[0] A[1] A[2] A[3] A[4] A[5] A[6] A[7]
  A =   1    3    5    8    9   17   32   50

• Eg. For x = 9, return index 4; for x = 10, return -1.

Binary Search
• Step 1. Find the element A[mid] at the middle position.
• Step 2. If x = A[mid], return mid;
          if x < A[mid], search the left half;
          if x > A[mid], search the right half.

Algorithm of Binary Search

int binsearch(int A[], int x, int n)   // n = number of elements in A[]
{ int left = 0, right = n-1, mid;      // search x in the sorted array A[0 … n-1]
  while (left <= right) {              // more integers to check
    mid = (left+right)/2;              // A[mid] is the middle element
    if (x == A[mid]) return mid;       // found: return its index
    if (x < A[mid]) right = mid-1;     // continue in the left half
    if (x > A[mid]) left = mid+1;      // continue in the right half
  }
  return -1;                           // x is not in A[]
}

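As a quick sanity check, here is a minimal, hypothetical driver (not from the slides; the test values are assumptions for illustration) that calls binsearch on the sample array above:

#include <stdio.h>

int binsearch(int A[], int x, int n);   // the function defined above

int main(void)
{
  int A[] = {1, 3, 5, 8, 9, 17, 32, 50};
  int n = sizeof(A) / sizeof(A[0]);
  printf("x = 9  -> index %d\n", binsearch(A, 9, n));    // expect 4
  printf("x = 10 -> index %d\n", binsearch(A, 10, n));   // expect -1
  return 0;
}
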
Recursive Algorithm
• Direct recursion
  - A procedure calls itself directly.
  - Eg. procA → procA
• Indirect recursion
  - A procedure calls other procedures that in turn invoke the calling procedure.
  - Eg. procA → procB → procA
  - (A small sketch of indirect recursion is given below.)
• Some problems are themselves defined recursively.

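To make indirect recursion concrete, here is a small illustrative sketch (not from the slides; the functions is_even and is_odd are assumed purely for illustration). They call each other in the procA → procB → procA pattern:

#include <stdio.h>

int is_odd(int n);                       // forward declaration

// is_even calls is_odd, which calls is_even again: indirect recursion
int is_even(int n) { return n == 0 ? 1 : is_odd(n - 1); }
int is_odd(int n)  { return n == 0 ? 0 : is_even(n - 1); }

int main(void)
{
  printf("%d\n", is_even(10));           // prints 1
  printf("%d\n", is_even(7));            // prints 0
  return 0;
}
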
Binomial Coefficients
• The binomial coefficient

    C(n, m) = n! / ( m! (n-m)! )

  can be computed by the recursive formula

    C(n, m) = C(n-1, m) + C(n-1, m-1),

  where C(n, 0) = C(n, n) = 1.

Computing Binomial Coefficients

// Compute the binomial coefficient C(n,m) recursively.
int bin_coeff(int n, int m)
{
  // termination conditions
  if (m == n) return 1;
  else if (m == 0) return 1;
  // recursive step
  else
    return bin_coeff(n-1, m) + bin_coeff(n-1, m-1);
}

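A minimal, hypothetical driver (assumed for illustration) that exercises bin_coeff:

#include <stdio.h>

int bin_coeff(int n, int m);             // the function defined above

int main(void)
{
  printf("C(5,2) = %d\n", bin_coeff(5, 2));   // expect 10
  printf("C(6,3) = %d\n", bin_coeff(6, 3));   // expect 20
  return 0;
}

Note that this direct recursion recomputes the same subproblems many times, so it becomes very slow for large n; it is meant only to mirror the recursive formula on the previous slide.
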
Hints for Recursive Algorithms
• To write a recursive algorithm, make sure it has
  - termination conditions;
  - parameter values that decrease, so that each call brings us one step closer to a solution.

Recursive Binary Search

int binsearch(int A[], int x, int left, int right)
{ int mid;
  if (left <= right) {            // more integers to check
    mid = (left+right)/2;         // A[mid] is the middle element
    if (x == A[mid]) return mid;
    if (x < A[mid])
      return binsearch(A, x, left, mid-1);
    if (x > A[mid])
      return binsearch(A, x, mid+1, right);
  }
  return -1;
}

A Running Example
• Search for x = 9 in array A[0 … 7]:

      A[0] A[1] A[2] A[3] A[4] A[5] A[6] A[7]
  A =   1    3    5    8    9   17   32   50
                       1st  3rd  2nd           (middle element probed by each call)

• 1st call: binsearch(A, 9, 0, 7)   -- mid = 3
• 2nd call: binsearch(A, 9, 4, 7)   -- mid = 5
• 3rd call: binsearch(A, 9, 4, 4)   -- mid = 4, A[4] = 9, so return index 4.

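The trace above can be reproduced with a small, hypothetical test harness (assumed for illustration) that prints the (left, right) range of each recursive call:

#include <stdio.h>

// Recursive binary search, instrumented to print each call.
int binsearch(int A[], int x, int left, int right)
{
  int mid;
  printf("binsearch(A, %d, %d, %d)\n", x, left, right);
  if (left <= right) {
    mid = (left + right) / 2;
    if (x == A[mid]) return mid;
    if (x < A[mid])  return binsearch(A, x, left, mid - 1);
    return binsearch(A, x, mid + 1, right);
  }
  return -1;
}

int main(void)
{
  int A[] = {1, 3, 5, 8, 9, 17, 32, 50};
  printf("result: index %d\n", binsearch(A, 9, 0, 7));   // three calls, then index 4
  return 0;
}
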
Criteria for Good Programs
• There are many criteria by which to judge a program:
  - Does it meet the original task specification?
  - Does it work correctly?
  - Is there documentation for using the program?
  - Modularity
  - Code readability
• Although the above criteria are vitally important, they are difficult to achieve.
• Achieving them takes a great deal of real experience and practice.

Performance Analysis
• Two criteria for performance analysis/evaluation:
  - How much memory space is needed?
  - How much running time is needed?
• Performance analysis vs. performance measurement
  - Performance analysis:
    machine independent; an a priori estimate; the heart of “complexity theory”.
  - Performance measurement:
    machine dependent; a posteriori testing.

Performance Analysis
• Performance analysis contains two parts:
  Space Complexity and Time Complexity.
• Space Complexity also has two parts:
  - 1st part -- a fixed part:
    independent of the number and size of the inputs and outputs;
    includes instruction space and space for simple variables, fixed-size structured variables, and constants.
  - 2nd part -- a variable part: (see next slide …)

Performance Analysis
• Space Complexity also has two parts:
  - 2nd part -- a variable part:
    depends on the particular problem instance I being solved;
    includes recursion stack space and structured variables whose size depends on the particular instance I.
• Thus Space Complexity  S(P) = 1st part + 2nd part
                               = C + S_P(I)
  where C is a constant; the analysis concentrates on evaluating the term S_P(I).

Instance Characteristic (I)
• Commonly used characteristics I include the number, size, and values of the inputs and outputs.
  - Eg. sorting(A[], n):  I = number of integers = n.
  - Eg. summing 1 to n, i.e. 1+2+3+…+n:  I = value of n = n.

Space Complexity
• Eg. (See Program 1.10 in the textbook.)

float abc(float a, float b, float c)
{
  return a + b + b*c + (a+b-c)/(a+b) + 4.00;
}

• C = space for the program + space for the variables a, b, c, abc = constant
• S_P(I) = 0
• Thus S(P) = C + S_P(I) = constant.

Iterative Summing (Program 1.11)

// Compute the sum of n numbers in A[] iteratively.
float sum(float A[], int n)
{ float tempsum = 0;
  int i;
  for (i = 0; i < n; i++)
    tempsum += A[i];
  return tempsum;
}

• In C: arrays are passed by reference only; other parameters are passed by value.
• In PASCAL: all parameters can be passed by reference or by value.

Iterative Summing (Program 1.11)
• Instance characteristic I = n (the size of array A[])
• If the array is passed by reference (as in C):
  - S_sum(I) = S_sum(n) = 0
• If the array is passed by value:
  - S_sum(I) = S_sum(n) = n

Recursive Summing (Program 1.12)

// Compute the sum of n numbers in A[] recursively.
float rsum(float A[], int n)
{
  if (n) return rsum(A, n-1) + A[n-1];
  return 0;
}

• Instance characteristic I = n
• Each call requires one word each for the parameter A, the parameter n, and the return address:
    4 ∙ (1 + 1 + 1) = 12 bytes
• How many calls (recursion depth)?
    rsum(A, n) → rsum(A, n-1) → … → rsum(A, 0)  ==>  n+1 calls
• S_rsum(I) = S_rsum(n) = 12 ∙ (n+1)

Time Complexity
• Time taken by program P:
    T(P) = compile time + running time
         = T_C + T_P(I)
  where T_C is a constant.
• How to evaluate T_P(I)?
  - Add, subtract, multiply, … take different running times.
  - Use the “program step” to estimate running time.
    A “program step” is a program segment whose execution time is independent of the instance characteristic I.
  - Eg. abc = a + b + b*c;   -- one program step
        a = 2;               -- one program step

Iterative Summing (Program 1.11)

// Compute the sum of n numbers in A[] iteratively.
float sum(float A[], int n)
{ float tempsum = 0;          // 1 step
  int i;
  for (i = 0; i < n; i++)     // n+1 steps
    tempsum += A[i];          // n steps
  return tempsum;             // 1 step
}

• Instance characteristic I = n (the size of array A[])
• T_sum(I) = T_sum(n) = 1 + (n+1) + n + 1 = 2n + 3

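One way to check the 2n + 3 figure is to count the steps mechanically with a global counter; below is a hedged sketch of that idea (the counter variable and the driver are assumptions for illustration, not Program 1.11 itself):

#include <stdio.h>

long count = 0;                       // global step counter

float sum(float A[], int n)
{
  float tempsum = 0;
  int i;
  count++;                            // 1 step: the assignment of tempsum
  for (i = 0; i < n; i++) {
    count++;                          // one step per loop test that succeeds
    tempsum += A[i];  count++;        // one step per addition
  }
  count++;                            // the final loop test (i = n)
  count++;                            // the return statement
  return tempsum;
}

int main(void)
{
  float A[10] = {0};
  sum(A, 10);
  printf("count = %ld\n", count);     // prints 23 = 2*10 + 3
  return 0;
}
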
Recursive Summing (Program 1.12)

// Compute the sum of n numbers in A[] recursively.
float rsum(float A[], int n)
{ if (n)                              // 1 step
    return rsum(A, n-1) + A[n-1];     // 1 step
  return 0;                           // 1 step
}

• Instance characteristic I = n
• Recurrence relation for T_rsum(n):
  - T_rsum(0) = 2
  - T_rsum(n) = 2 + T_rsum(n-1)
              = 2 + (2 + T_rsum(n-2))
              = …
              = 2n + T_rsum(0) = 2n + 2

Matrix Addition (Program 1.16)

// Add two rows × cols matrices a and b, storing the result in c.
void add(int a[][MAX_SIZE], int b[][MAX_SIZE], int c[][MAX_SIZE],
         int rows, int cols)
{ int i, j;
  for (i = 0; i < rows; i++) {          // rows+1 steps
    for (j = 0; j < cols; j++)          // rows∙(cols+1) steps
      c[i][j] = a[i][j] + b[i][j];      // rows∙cols steps
  }
}

• Instance characteristic I = (rows, cols)
• T_P(rows, cols) = (rows+1) + rows∙(cols+1) + rows∙cols
                  = 2∙rows∙cols + 2∙rows + 1

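For completeness, a minimal hypothetical driver (the value of MAX_SIZE and the 2 × 3 test matrices are assumptions for illustration):

#include <stdio.h>
#define MAX_SIZE 10

void add(int a[][MAX_SIZE], int b[][MAX_SIZE], int c[][MAX_SIZE],
         int rows, int cols);            // the function defined above

int main(void)
{
  int a[MAX_SIZE][MAX_SIZE] = {{1, 2, 3}, {4, 5, 6}};
  int b[MAX_SIZE][MAX_SIZE] = {{6, 5, 4}, {3, 2, 1}};
  int c[MAX_SIZE][MAX_SIZE];
  int i, j;

  add(a, b, c, 2, 3);
  for (i = 0; i < 2; i++) {
    for (j = 0; j < 3; j++) printf("%d ", c[i][j]);   // prints 7 7 7 on each row
    printf("\n");
  }
  return 0;
}
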
Observation on Step Counts
• In the previous examples:
  - T_sum(n) = 2n + 3 steps
  - T_rsum(n) = 2n + 2 steps
• Can we say that rsum is faster than sum?
  - No, since the execution times of individual steps differ.
• The step count is useful for answering:
  - “How does the running time change with changes in the instance characteristic?”

Growth Rate of Time Complexity
• Eg. For the sum program, T_sum(n) = 2n + 3 “means”:
  - when n grows 10-fold, T_sum(n) grows roughly 10-fold;
  - the sum program runs in linear time.
• We only want to know the growth rate (called the asymptotic time).
  - Eg. T_sum(n) = 2n + 3 vs. T_rsum(n) = 2n + 2:
    clearly T_sum(n) and T_rsum(n) have the same growth rate.

Asymptotic Notation (Big-O)
• Used to compare the time complexity of two programs that compute the same function.
• Used to predict the growth rate of the running time as the instance characteristic increases.
• Eg. Two programs with time complexity:
  - P1: c1 n^2 + c2 n
  - P2: c3 n
  Let c1 = 1, c2 = 2, and c3 = 100. Then
  - P1 is faster: n^2 + 2n ≤ 100n  for n ≤ 98
  - P2 is faster: n^2 + 2n > 100n  for n > 98

Asymptotic Notation (Big-O)
• Eg. Two programs with time complexity:
  - P1: c1 n^2 + c2 n
  - P2: c3 n
  Let c1 = 1, c2 = 2, and c3 = 1000. Then
  - P1 is faster: n^2 + 2n ≤ 1000n  for n ≤ 998
  - P2 is faster: n^2 + 2n > 1000n  for n > 998
• No matter what the values of c1, c2, and c3 are, there is an n beyond which
    c1 n^2 + c2 n > c3 n
  (namely, whenever n > (c3 - c2)/c1).

Definition of Big-O
• Definition of O-notation:
    f(n) = O(g(n)) iff there exist constants c, n0 > 0 such that
    f(n) ≤ c∙g(n) for all n ≥ n0.
• Eg. 3n + 2 = O(n),
    since 3n + 2 ≤ 4n for all n ≥ 2.
• Eg. 100n + 6 = O(n),
    since 100n + 6 ≤ 101n for all n ≥ 10.
• Eg. 10n^2 + 4n + 2 = O(n^2),
    since 10n^2 + 4n + 2 ≤ 11n^2 for all n ≥ 5.

Observation of Big-O
• Leading constants and lower-order terms do not matter.
  - One can always find a constant large enough to make the highest-order term swamp the other terms.
• Thm 1.2: If f(n) = a_m n^m + … + a_1 n + a_0, then f(n) = O(n^m).
  - Eg. 10n^2 + 4n + 2 = O(n^2)

Property of Big-O
• Thm 1.2: If f(n) = a_m n^m + … + a_1 n + a_0, then f(n) = O(n^m).
• Pf. For n ≥ 1,
    f(n) ≤ |a_m| n^m + … + |a_1| n + |a_0|
         ≤ n^m (|a_m| + … + |a_1| + |a_0|)
         ≤ c n^m,  where c = |a_m| + … + |a_1| + |a_0|.
  Hence f(n) = O(n^m).

More Examples
• 0.1n^2 - 10n - 6 = O(n^2)
  - The constant inside the O is always taken to be 1: we write O(n^2), never O(2n^2).
• n + log n = O(n)
• n + n log n = O(n log n)
• n^2 + log n = O(n^2)
• 2^n + n^10000 = O(2^n)
• n^4 + 1000n^3 + n^2 = O(n^4)
• n^4 + 1000n^3 + n^2 = O(n^5)
• n^4 + 1000n^3 + n^2 ≠ O(n^3)

Naming Common Functions
• O(1)      -- constant time
• O(log n)  -- logarithmic time
• O(n)      -- linear time
• O(n^2)    -- quadratic time
• O(n^3)    -- cubic time
• O(n^100)  -- polynomial time
• O(2^n)    -- exponential time
• When n is large enough, each function in this list eventually takes more time than the ones listed before it.

More Notes on Big-O
• f(n) = O(g(n)) “means” that g(n) is an upper bound of f(n):
  - n = O(n)
  - n = O(n^2)
  - n = O(n^3)
  - We want g(n) to be as small as possible!
• Big-O is usually used to express the worst-case running time of a program.
• f(n) = O(g(n)) is correct notation, but O(g(n)) = f(n) is wrong.

Compute Running Time in Big-O
• How do we compute the time complexity of a program in big-O?
  - Compute the total step count, then take the big-O; or
  - take the big-O of each step, then sum up the big-O of all steps.

Rule of Sum
• Thm: If f1(n) = O(g1(n)) and f2(n) = O(g2(n)),
       then f1(n) + f2(n) = O(max(g1(n), g2(n))).
• Eg. f1(n) = O(n), f2(n) = O(n^2).  Then f1(n) + f2(n) = O(n^2).
• Eg. f1(n) = O(n), f2(n) = O(n).    Then f1(n) + f2(n) = O(n).
• Used to combine program segments P1 and P2 when P1 is followed by P2 (see the sketch below).

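As an illustrative sketch (not from the slides; the variable names and the value of n are assumptions), the fragment below runs an O(n) loop P1 followed by an O(n^2) nested loop P2, so by the rule of sum the whole fragment is O(max(n, n^2)) = O(n^2):

#include <stdio.h>

int main(void)
{
  int n = 100, i, j;
  int A[100];
  long sum = 0, count = 0;

  for (i = 0; i < n; i++) A[i] = i;     // set up some data

  // P1: a single loop -- O(n)
  for (i = 0; i < n; i++)
    sum += A[i];

  // P2: a doubly nested loop -- O(n^2)
  for (i = 0; i < n; i++)
    for (j = 0; j < n; j++)
      count++;

  // P1 followed by P2: O(n) + O(n^2) = O(n^2) by the rule of sum
  printf("sum = %ld, count = %ld\n", sum, count);
  return 0;
}
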
Rule of Product
• Thm: If f1(n) = O(g1(n)) and f2(n) = O(g2(n)),
       then f1(n) ∙ f2(n) = O(g1(n) ∙ g2(n)).
• Eg. f1(n) = O(n), f2(n) = O(n).  Then f1(n) ∙ f2(n) = O(n^2).
• Used in the time analysis of nested loops (see next slide …).

Rule of Product

for (i = 0; i < n; i++) {     // O(n)
  for (j = 0; j < n; j++)     // O(n)
    sum = sum + 1;            // O(1)
}

• By the rule of product, the running time of this fragment is
    f(n) = O(n ∙ n ∙ 1) = O(n^2).

Complexity of Binary Search

int binsearch(int A[], int x, int n)    // n = number of elements in A[]
{ int left = 0, right = n-1, mid;       // search x in A[]
  while (left <= right) {               // loop executes O(log2 n) times
    mid = (left+right)/2;               // O(1)
    if (x == A[mid]) return mid;        // O(1)
    if (x < A[mid]) right = mid-1;      // O(1)
    if (x > A[mid]) left = mid+1;       // O(1)
  }
  return -1;                            // O(1)
}

Complexity of Binary Search
• Analysis of the while loop:
  - Iteration 1:   n values to be searched
  - Iteration 2:   n/2 left for searching
  - Iteration 3:   n/4 left for searching
  - …
  - Iteration k+1: n/2^k left for searching
• When n/2^k = 1, the search must finish.
  That is, n = 2^k  ==>  k = log2 n.
• Hence, the worst-case running time of binary search is O(log2 n).

Definition of Big-Ω
• Definition of Ω-notation:
    f(n) = Ω(g(n)) iff there exist constants c, n0 > 0 such that
    f(n) ≥ c∙g(n) for all n ≥ n0.
• Eg. 3n + 2 = Ω(n),
    since 3n + 2 ≥ 3n for all n ≥ 1.
• Eg. 100n + 6 = Ω(n),
    since 100n + 6 ≥ 100n for all n ≥ 1.
• Eg. 10n^2 + 4n + 2 = Ω(n^2),
    since 10n^2 + 4n + 2 ≥ n^2 for all n ≥ 1.

Definition of Big-Θ
• Definition of Θ-notation:
    f(n) = Θ(g(n)) iff f(n) = O(g(n)) and f(n) = Ω(g(n)).
• Eg. 3n + 2 = Θ(n)
• Eg. 100n + 6 = Θ(n)
• Eg. 10n^2 + 4n + 2 = Θ(n^2)

Plot of Common Function Values
• (figure of the common function values omitted)

Running Times On Computer
• (table of running times on a computer omitted)

Performance Measurement
• Obtain the actual space and time requirements when running a program.
• How to do time measurement in C?
  - Method 1: use clock(), measured in clock ticks (see the next slide for details …)
  - Method 2: use time(), measured in seconds (see the slide after that for details …)
• To time a short event, it is necessary to repeat it many times and then take the average (a sketch of this idea follows the two methods below).

Performance Measurement
• Method 1: use clock(), measured in clock ticks

#include <time.h>

int main(void)
{
  clock_t start = clock();
  // main body of program comes here!
  clock_t stop = clock();
  double duration = ((double) (stop - start)) / CLOCKS_PER_SEC;
  return 0;
}

Performance Measurement
• Method 2: use time(), measured in seconds

#include <time.h>

int main(void)
{
  time_t start = time(NULL);
  // main body of program comes here!
  time_t stop = time(NULL);
  double duration = difftime(stop, start);
  return 0;
}

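Putting the advice about short events into practice, here is a minimal sketch that repeats an event many times with clock() and reports the average; the repetition count and the dummy workload work_to_time() are assumptions for illustration:

#include <stdio.h>
#include <time.h>

// A hypothetical short event to be timed; replace it with the real code.
static void work_to_time(void)
{
  volatile long s = 0;
  long i;
  for (i = 0; i < 1000; i++) s += i;
}

int main(void)
{
  long repetitions = 100000, r;         // assumed repetition count
  clock_t start, stop;
  double total, average;

  start = clock();
  for (r = 0; r < repetitions; r++)
    work_to_time();                     // repeat the short event many times
  stop = clock();

  total = ((double) (stop - start)) / CLOCKS_PER_SEC;
  average = total / repetitions;        // average time of one event, in seconds
  printf("total = %f s, average = %g s\n", total, average);
  return 0;
}
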